Streptococcus pyogenes Antigens

The present invention relates to isolated nucleic acid molecules, which encode antigens for Streptococcus 
pyogenes, which are suitable for use in preparation of pharmaceutical medicaments for the prevention and 
treatment of bacterial infections caused by Streptococcus pyogenes. 

Streptococcus pyogenes, also called group A streptococci (GAS), is an important gram-positive extracellular
bacterial pathogen that commonly infects humans. GAS colonize the throat or skin and are responsible
for a number of suppurative infections and non-suppurative sequelae. It primarily affects children
and causes a variety of infections including bacterial pharyngitis, scarlet fever, impetigo and sepsis in
humans. Decades of epidemiological studies have led to the concept of distinct throat and skin strains,
where certain serotypes are often associated with throat or skin infections, respectively {Cunningham, M.,
2000}. GAS have been found responsible for streptococcal toxic shock syndrome-associated
necrotizing fasciitis, which has recently resurged in the USA {Cone, L. et al., 1987; Stevens, D., 1992}, and
have been described as the "flesh eating" bacterium, which invades skin and soft tissues, leading to tissue or
limb destruction.

Several post-streptococcal sequelae may occur in humans subsequent to infection, such as acute
rheumatic fever, acute glomerulonephritis and reactive arthritis. Of these, acute rheumatic fever and rheumatic
heart disease are the most serious autoimmune sequelae and have led to disability and death of
children worldwide. S. pyogenes can also cause severe acute diseases such as scarlet fever and
necrotizing fasciitis and has been associated with Tourette's syndrome, tics and movement and attention
disorders.

Group A streptococci are the most common bacterial cause of sore throat and pharyngitis and account for
at least 16% of all office calls in a general medical practice, depending on the season {Hope-Simpson, R., 1981}. Infection
primarily affects school-age children between 5 and 15 years of age {Cunningham, M., 2000}. All ages are
susceptible to spread of the organism under crowded conditions, for example in schools. GAS are not
considered normal flora, but pharyngeal carriage of group A streptococci can occur without
clinical symptoms.

Group A streptococci can be distinguished by the Lancefield classification scheme of serologic typing
based on their carbohydrate, or classified into M protein serotypes based on a surface protein that can be
extracted by boiling bacteria with hydrochloric acid. This has led to the identification of more than 80
serotypes, which can also be typed by a molecular approach (emm genes). Certain M protein serotypes of
S. pyogenes are mainly associated with pharyngitis and rheumatic fever, while others mainly seem to
cause pyoderma and acute glomerulonephritis {Cunningham, M., 2000}.

Also implicated in causing pharyngitis and occasionally toxic shock are group C and G streptococci, 
which must be distinguished after throat culture {Hope-Simpson, R., 1981; Bisno, A. et al., 1987}.
Currently, streptococcal infections can only be treated by antibiotic therapy. However, 25-30% of those 
treated with antibiotics show recurrent disease and/or shed the organism in mucosal secretions. There is 
at present no preventive treatment (vaccine) available to avoid streptococcal infections. 

Thus, there remains a need for an effective treatment to prevent or ameliorate streptococcal infections. A 
vaccine could not only prevent infections by streptococci, but more specifically prevent or ameliorate 
colonization of host tissues, thereby reducing the incidence of pharyngitis and other suppurative 
infections. Elimination of non-suppurative sequelae such as rheumatic fever, acute glomerulonephritis, 
sepsis, toxic shock and necrotizing fasciitis would be a direct consequence of reducing the incidence of 
acute infection and carriage of the organism. Vaccines capable of showing cross-protection against other 
streptococci would also be useful to prevent or ameliorate infections caused by all other beta-hemolytic 
streptococcal species, namely groups A, B, C and G.

A vaccine can contain a whole variety of different antigens. Examples of antigens are whole-killed or
attenuated organisms, subfractions of these organisms/tissues, proteins, or, in their most simple form,
peptides. Antigens can also be recognized by the immune system in the form of glycosylated proteins or
peptides and may also be or contain polysaccharides or lipids. Short peptides can be used since, for
example, cytotoxic T-cells (CTL) recognize antigens in the form of short peptides, usually 8-11 amino acids long,
in conjunction with the major histocompatibility complex (MHC). B-cells can recognize linear
epitopes as short as 4-5 amino acids, as well as three-dimensional structures (conformational epitopes). In
order to obtain sustained, antigen-specific immune responses, adjuvants need to trigger immune
cascades that involve all necessary cells of the immune system. Adjuvants act primarily, although
not exclusively, on so-called antigen presenting cells (APCs). These cells usually
first encounter the antigen(s) and then present processed or unmodified antigen to immune
effector cells. Intermediate cell types may also be involved. Only effector cells with the appropriate
specificity are activated in a productive immune response. The adjuvant may also locally retain antigens
and other co-injected factors. In addition, the adjuvant may act as a chemoattractant for other immune
cells or may act locally and/or systemically as a stimulating agent for the immune system.
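
The peptide lengths mentioned above (8-11 amino acids for CTL recognition, 4-5 amino acids for the shortest linear B-cell epitopes) can be illustrated with a minimal sketch that enumerates every contiguous peptide of a given length range in a protein sequence. The function name and the 12-residue example sequence are hypothetical and serve only as an illustration; the sketch makes no prediction about which peptides are actually immunogenic.

```python
def peptide_windows(protein: str, min_len: int, max_len: int) -> list[tuple[int, int, str]]:
    """Return (start, end, peptide) for every contiguous window.

    Positions are 1-based and inclusive, matching the way residue
    ranges such as "4-44" are written in this document.
    """
    windows = []
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            peptide = protein[start:start + length]
            windows.append((start + 1, start + length, peptide))
    return windows


# Hypothetical 12-residue sequence, used only for illustration.
example = "MKTAYIAKQRQI"

# Candidate lengths for CTL recognition (8-11 residues).
ctl_candidates = peptide_windows(example, 8, 11)

# Shortest linear B-cell epitope lengths (4-5 residues).
b_cell_candidates = peptide_windows(example, 4, 5)

print(ctl_candidates[0])       # (1, 8, 'MKTAYIAK')
print(len(ctl_candidates))     # 14
print(len(b_cell_candidates))  # 17
```

The number of candidate windows grows roughly linearly with protein length for each peptide length, which is one reason experimental selection with sera is used to narrow down the relevant regions.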

Approaches to develop a group A streptococcal vaccine have focused mainly on the cell surface M 
protein of S. pyogenes {Bessen, D. et al., 1988; Bronze, M. et al., 1988}. Since more than 80 different M 
serotypes of S. pyogenes exist and new serotypes continually arise {Fischetti, V., 1989}, inoculation with a
limited number of serotype-specific M proteins or M protein-derived peptides is unlikely to be effective in
protecting against all other M serotypes. Furthermore, it has been shown that the M protein contains an
amino acid sequence which is immunologically cross-reactive with human heart tissue and which is thought
to account for the heart valve damage associated with rheumatic fever {Fenderson, P. et al., 1989}.

There are other proteins under consideration for vaccine development, such as the erythrogenic toxins, 
streptococcal pyrogenic exotoxin A and streptococcal pyrogenic exotoxin B {Lee, P. K., 1989}. Immunity to 
these toxins could possibly prevent the deadly symptoms of streptococcal toxic shock, but it may not 
prevent colonization by group A streptococci. 

The use of the above described proteins as antigens for a potential vaccine, as well as a number of
additional candidates {Ji, Y. et al., 1997; Guzman, C. et al., 1999}, resulted mainly from a selection based on
ease of identification or chance availability. There is a demand to identify efficient and relevant
antigens for S. pyogenes.

The present inventors have developed a method for identification, isolation and production of
hyperimmune serum reactive antigens from a specific pathogen, especially from Staphylococcus aureus
and Staphylococcus epidermidis (WO 02/059148). However, given the differences in biological properties,
pathogenic function and genetic background, Streptococcus pyogenes is distinct from Staphylococcus
strains. Importantly, the selection of sera for the identification of antigens from S. pyogenes is different
from that applied to the S. aureus screens. Three major types of human sera were collected for that
purpose. First, healthy adults below 45 years of age, preferably with small children in the household,
were tested for nasopharyngeal carriage of S. pyogenes. A large percentage of young children are carriers
of S. pyogenes, and they are considered a source of exposure for their family members. Based on
correlative data, protective (colonization neutralizing) antibodies are likely to be present in exposed
individuals (children with high carriage rate in the household) who are not carriers of S. pyogenes. To be
able to select relevant serum sources, a series of ELISAs measuring anti-S. pyogenes IgG and IgA
antibody levels was performed with bacterial lysates and culture supernatant proteins. Sera from high
titer non-carriers were included in the genomic based antigen identification. This approach to the selection
of human sera is fundamentally different from that used for S. aureus, where carriage or non-carriage state
cannot be associated with antibody levels.
Second, serum samples from patients with pharyngitis were characterized and selected in the same way.
The third group of serum samples, obtained from individuals with post-streptococcal sequelae such as
acute rheumatic fever and glomerulonephritis, was used mainly for validation purposes. This latter
group helps in the exclusion of epitopes which induce high levels of antibodies in these patients, since
post-streptococcal disease is associated with antibodies induced by GAS and reactive against human
tissues, such as heart muscle, or involved in harmful immune complex formation in the kidney glomeruli.

The genomes of the two bacterial species, S. pyogenes and S. aureus, themselves show a number of important
differences. The genome of S. pyogenes contains approximately 1.85 Mb, while that of S. aureus harbours 2.85 Mb. They
have an average GC content of 38.5 and 33%, respectively, and approximately 30 to 45% of the encoded
genes are not shared between the two pathogens. In addition, the two bacterial species require different
growth conditions and media for propagation. While S. pyogenes is a strictly human pathogen, S. aureus
can also be found infecting a range of warm-blooded animals. A list of the most important diseases
which can be inflicted by the two pathogens is presented below. S. aureus causes mainly nosocomial,
opportunistic infections: impetigo, folliculitis, abscesses, boils, infected lacerations, endocarditis,
meningitis, septic arthritis, pneumonia, osteomyelitis, scalded skin syndrome (SSS), toxic shock
syndrome. S. pyogenes causes mainly community acquired infections: streptococcal sore throat (fever,
exudative tonsillitis, pharyngitis), streptococcal skin infections, scarlet fever, puerperal fever, septicemia,
erysipelas, perianal cellulitis, mastoiditis, otitis media, pneumonia, peritonitis, wound infections, acute
glomerulonephritis, acute rheumatic fever, toxic shock-like syndrome, necrotizing fasciitis.
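
The average GC content quoted above for the two genomes is the percentage of bases that are G or C. A minimal sketch of that calculation follows; the function name and the short example fragment are hypothetical, and a real genome would be read from a sequence file rather than typed in.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the A, C, G and T bases of seq."""
    # Ambiguity codes such as N are ignored rather than counted as mismatches.
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical 10-base fragment, used only for illustration.
fragment = "ATGCGTTAAC"
print(gc_content(fragment))  # 40.0
```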

The problem underlying the present invention was to provide means for the development of 
medicaments such as vaccines against S. pyogenes infection. More particularly, the problem was to 
provide an efficient, relevant and comprehensive set of nucleic acid molecules or hyperimmune serum 
reactive antigens from S. pyogenes that can be used for the manufacture of said medicaments. 

Therefore, the present invention provides an isolated nucleic acid molecule encoding a hyperimmune 
serum reactive antigen or a fragment thereof comprising a nucleic acid sequence which is selected from 
the group consisting of: 

a) a nucleic acid molecule having at least 70% sequence identity to a nucleic acid molecule selected 
from Seq ID No 1, 4-8, 10-18, 20, 22, 24-32, 34-35, 38-40, 43-46, 49-51, 53-54, 57-61, 63, 65-71, 73, 
75-77, 81-82, 88, 91-94 and 96-150. 

b) a nucleic acid molecule which is complementary to the nucleic acid molecule of a), 

c) a nucleic acid molecule comprising at least 15 sequential bases of the nucleic acid molecule of a) 
or b),

d) a nucleic acid molecule which anneals under stringent hybridisation conditions to the nucleic 
acid molecule of a), b), or c) 

e) a nucleic acid molecule which, but for the degeneracy of the genetic code, would hybridise to the 
nucleic acid molecule defined in a), b), c) or d). 
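
Items a) to c) above rest on three simple sequence relations: percent identity, complementarity, and a shared run of at least 15 sequential bases. The following is a minimal sketch of each, under stated assumptions. This section does not specify how sequence identity is computed, so the sketch assumes two sequences that have already been aligned to equal length (gaps written as "-") and counts identical columns. All function names and the 30-base example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical columns in two pre-aligned sequences.

    Assumption: the inputs are already aligned and of equal length.
    A column containing a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def shares_run(reference: str, candidate: str, run: int = 15) -> bool:
    """True if candidate contains at least `run` sequential bases of reference."""
    reference = reference.upper()
    candidate = candidate.upper()
    kmers = {reference[i:i + run] for i in range(len(reference) - run + 1)}
    return any(
        candidate[i:i + run] in kmers
        for i in range(len(candidate) - run + 1)
    )


# Hypothetical 30-base sequences; the variant differs at three positions.
reference = "ATGGCTAAAGGTCTGACCGATTACGCAGTT"
variant = "ATGGCCAAAGGTCTAACCGATTACGCAGTA"

print(percent_identity(reference, variant))          # 90.0
print(percent_identity(reference, variant) >= 70.0)  # True
print(reverse_complement("ATGC"))                    # GCAT
print(shares_run(reference, reference[5:25]))        # True
```

In practice identity is usually computed from an alignment produced by a dedicated alignment program, and the choice of alignment parameters affects the resulting percentage.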

According to a preferred embodiment of the present invention the sequence identity is at least 80%, 
preferably at least 95%, especially 100%. 

Furthermore, the present invention provides an isolated nucleic acid molecule encoding a hyperimmune 
serum reactive antigen or a fragment thereof comprising a nucleic acid sequence selected from the group 
consisting of 

a) a nucleic acid molecule having at least 96% sequence identity to a nucleic acid molecule selected 
from Seq ID No 64,

b) a nucleic acid molecule which is complementary to the nucleic acid molecule of a), 

c) a nucleic acid molecule comprising at least 15 sequential bases of the nucleic acid molecule of a) 
or b) 

d) a nucleic acid molecule which anneals under stringent hybridisation conditions to the nucleic
acid molecule of a), b) or c),

e) a nucleic acid molecule which, but for the degeneracy of the genetic code, would hybridise to the
nucleic acid defined in a), b), c) or d).
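
The phrase "but for the degeneracy of the genetic code" refers to the fact that most amino acids are encoded by more than one codon, so two nucleic acids can differ in sequence and still encode the same polypeptide. A minimal sketch of this follows, using the standard codon table; the two 9-base sequences are hypothetical and chosen only to show synonymous codons.

```python
from itertools import product

# Standard genetic code, codons ordered TCAG at each position
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence in frame 1 into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequences that differ at two third-codon positions.
seq_a = "ATGAAAGCT"
seq_b = "ATGAAGGCC"

print(translate(seq_a))  # MKA
print(translate(seq_b))  # MKA
print(seq_a == seq_b)    # False
```

Both sequences encode the same tripeptide although they are not identical, which is the situation item e) is written to cover.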

According to another aspect, the present invention provides an isolated nucleic acid molecule comprising
a nucleic acid sequence selected from the group consisting of

a) a nucleic acid molecule selected from Seq ID No 3, 36, 47-48, 55, 62, 72, 80, 84, 95. 

b) a nucleic acid molecule which is complementary to the nucleic acid of a), 

c) a nucleic acid molecule which, but for the degeneracy of the genetic code, would hybridise to the 
nucleic acid defined in a), b), c) or d). 

Preferably, the nucleic acid molecule is DNA or RNA. 

According to a preferred embodiment of the present invention, the nucleic acid molecule is isolated from 
a genomic DNA, especially from a S. pyogenes genomic DNA. 

According to the present invention, a vector comprising a nucleic acid molecule according to any aspect of the
present invention is provided.

In a preferred embodiment the vector is adapted for recombinant expression of the hyperimmune serum 
reactive antigens or fragments thereof encoded by the nucleic acid molecule according to the present 
invention. 

The present invention also provides a host cell comprising the vector according to the present invention. 

According to another aspect the present invention further provides a hyperimmune serum-reactive 
antigen comprising an amino acid sequence being encoded by a nucleic acid molecule according to the 
present invention. 

In a preferred embodiment the amino acid sequence (polypeptide) is selected from the group consisting 
of Seq ID No 151, 154-158, 160-168, 170, 172, 174-182, 184-185, 188-190, 193-196, 199-201, 203-204, 207-211, 
213, 215-221, 223, 225-227, 231-232, 238, 241-244 and 246-300. 

In another preferred embodiment the amino acid sequence (polypeptide) is selected from the group 
consisting of Seq ID No 214.

In a further preferred embodiment the amino acid sequence (polypeptide) is selected from the group 
consisting of Seq ID No 153, 186, 197-198, 205, 212, 222, 230, 234, 245. 

According to a further aspect the present invention provides fragments of hyperimmune serum-reactive 
antigens selected from the group consisting of peptides comprising amino acid sequences of column 
"predicted immunogenic aa" and "location of identified immunogenic region" of Table 1; the serum 
reactive epitopes of Table 2, especially peptides comprising amino acids 4-44, 57-65, 67-98, 101-107, 109- 
125, 131-144, 146-159, 168-173, 181-186, 191-200, 206-213, 229-245, 261-269, 288-301, 304-317, 323-328, 350- 
361, 374-384, 388-407, 416-425 and 1-114 of Seq ID No 151; 5-17, 49-64, 77-82, 87-98, 118-125, 127-140, 142- 
150, 153-159, 191-207, 212-218, 226-270, 274-287, 297-306, 325-331, 340-347, 352-369, 377-382, 390-395 and 
29-226 of Seq ID No 152; 4-16, 20-26, 32-74, JS-'S?, 93-108, 116-141, 148-162, 165-180, 206-219, 221-228, 230-
236, 239-245, 257-268, 313-328, 330-335, 353-359, 367-375, 394-403, 414-434, 437-444, 446-453, 456-464, 478-
487, 526-535, 541-552, 568-575, 577-584, 589-598, 610-618, 624-643, 653-665, 667-681, 697-718, 730-748, 755-
761, 773-794, 806-821, 823-831, 837-845, 862-877, 879-889, 896-919, 924-930, 935-940, 947-955, 959-964, 969- 
986, 991-1002, 1012-1036, 1047-1056, 1067-1073, 1079-1085, 1088-1111, 1130-1135, 1148-1164, 1166-1173, 
1185-1192, 1244-1254 and 919-929 of Seq ID No 153; 5-44, 62-74, 78-83, 99-105, 107-113, 124-134, 161-174, 
176-194, 203-211, 216-237, 241-247, 253-266, 272-299, 323-349, 353-360 and 145-305 of Seq ID No 154; 15-39, 
52-61, 72-81, 92-97 and 71-81 of Seq ID No 155; 13-19, 21-31, 40-108, 115-122, 125-140, 158-180, 187-203,
210-223, 235-245 and 173-186 of Seq ID No 156; 5-12, 19-27, 29-39, 59-67, 71-78, 80-88, 92-104, 107-124, 129- 
142, 158-168, 185-191, 218-226, 230-243, 256-267, 272-277, 283-291, 307-325, 331-344, 346-352 and 316-331 of 
Seq ID No 157; 6-28, 43-53, 60-76, 93-103 and 21-99 of Seq ID No 158; 10-30, 120-126, 145-151, 159-169, 
174-182, 191-196, 201-206, 214-220, 222-232, 254-272, 292-307, 313-323, 332-353, 361-369, 389-396, 401-415, 
428-439, 465-481, 510-517, 560-568 and 9-264 of Seq ID No 159; 5-29, 39-45, 107-128 and 1-112 of Seq ID 
No 160; 4-38, 42-50, 54-60, 65-71, 91-102 and 21-56 of Seq ID No 161; 4-13, 19-25, 41-51, 54-62, 68-75, 79-89, 
109-122, 130-136, 172-189, 192-198, 217-224, 262-268, 270-276, 281-298, 315-324, 333-342, 353-370, 376-391 
and 23-39 of Seq ID No 162; 6-41, 49-58, 62-103, 117-124, 147-166, 173-194, 204-211, 221-229, 255-261, 269- 
284, 288-310, 319-325, 348-380, 383-389, 402-410, 424-443, 467-479, 496-517, 535-553, 555-565, 574-581, 583- 
591 and 474-489 of Seq ID No 163; 8-35, 52-57, 66-73, 81-88, 108-114, 125-131, 160-167, 174-180, 230-235, 
237-249, 254-262, 278-285, 308-314, 321-326, 344-353, 358-372, 376-383, 393-411, 439-446, 453-464, 471-480, 
485-492, 502-508, 523-529, 533-556, 558-563, 567-584, 589-597, 605-619, 625-645, 647-666, 671-678, 690-714, 
721-728, 741-763, 766-773, 777-787, 792-802, 809-823, 849-864 and 37-241, 409-534, 582-604, 743-804 of Seq
ID No 164; 4-17, 24-36, 38-44, 59-67, 72-90, 92-121, 126-149, 151-159, 161-175, 197-215, 217-227, 241-247, 
257-264, 266-275, 277-284, 293-307, 315-321, 330-337, 345-350, 357-366, 385-416 and 202-337 of Seq ID No 
165; 4-20, 22-46, 49-70, 80-89, 96-103, 105-119, 123-129, 153-160, 181-223, 227-233, 236-243, 248-255, 261-269, 
274-279, 283-299, 305-313, 315-332, 339-344, 349-362, 365-373, 380-388, 391-397, 402-407 and 1-48 of Seq ID 
No 166; 18-37, 41-63, 100-106, 109-151, 153-167, 170-197, 199-207, 212-229, 232-253, 273-297 and 203-217 of 
Seq ID No 167; 20-26, 54-61, 80-88, 94-101, 113-119, 128-136, 138-144, 156-188, 193-201, 209-217, 221-229, 
239-244, 251-257, 270-278, 281-290, 308-315, 319-332, 339-352, 370-381, 388-400, 411-417, 426-435, 468-482, 
488-497, 499-506, 512-521 and 261-273 of Seq ID No 168; 6-12, 16-36, 50-56, 86-92, 115-125, 143-152, 163- 
172, 193-203, 235-244, 280-289, 302-315, 325-348, 370-379, 399-405, 411-417, 419-429, 441-449, 463-472, 482- 
490, 500-516, 536-543, 561-569, 587-594, 620-636, 647-653, 659-664, 677-685, 687-693, 713-719, 733-740, 746- 
754, 756-779, 792-799, 808-817, 822-828, 851-865, 902-908, 920-938, 946-952, 969-976, 988-1005, 1018-1027, 
1045-1057, 1063-1069, 1071-1078, 1090-1099, 1101-1109, 1113-1127, 1130-1137, 1162-1174, 1211-1221, 1234- 
1242, 1261-1268, 1278-1284, 1312-1317, 1319-1326, 1345-1353, 1366-1378, 1382-1394, 1396-1413, 1415-1424, 
1442-1457, 1467-1474, 1482-1490, 1492-1530, 1537-1549, 1559-1576, 1611-1616, 1624-1641 and 1-414, 443-614, 
997-1392 of Seq ID No 169; 14-42, 70-75, 90-100, 158-181 and 1-164 of Seq ID No 170; 4-21, 30-36, 54-82, 
89-97, 105-118, 138-147 and 126-207 of Seq ID No 171; 4-21, 31-66, 96-104, 106-113, 131-142 and 180-204 of 
Seq ID No 172; 5-23, 31-36, 38-55, 65-74, 79-88, 101-129, 131-154, 156-165, 183-194, 225-237, 245-261, 264- 
271, 279-284, 287-297, 313-319, 327-336, 343-363, 380-386 and 11-197, 204-219, 258-372 of Seq ID No 173; 4- 
20, 34-41, 71-86, 100-110, 113-124, 133-143, 150-158, 160-166, 175-182, 191-197, 213-223, 233-239, 259-278, 
298-322 and 195-289 of Seq ID No 174; 4-10, 21-35, 44-52, 54-62, 67-73, 87-103, 106-135, 161-174, 177-192, 
200-209, 216-223, 249-298, 304-312, 315-329 and 12-130 of Seq ID No 175; 10-27, 33-38, 48-55, 70-76, 96-107, 
119-133, 141-147, 151-165, 183-190, 197-210, 228-236, 245-250, 266-272, 289-295, 297-306, 308-315, 323-352, 
357-371, 381-390, 394-401, 404-415, 417-425, 427-462, 466-483, 485-496, 502-507, 520-529, 531-541, 553-570, 
577-588, 591-596, 600-610, 619-632, 642-665, 671-692, 694-707 and 434-444 of Seq ID No 176; 6-14, 16-25, 
36-46, 52-70, 83-111, 129-138, 140-149, 153-166, 169-181, 188-206, 212-220, 223-259, 261-269, 274-282, 286- 
293, 297-306, 313-319, 329-341, 343-359, 377-390, 409-415, 425-430 and 360-375 of Seq ID No 177; 4-26, 28- 
48, 54-62, 88-121, 147-162, 164-201, 203-237, 245-251 and 254-260 of Seq ID No 178; 12-21, 26-32, 66-72, 87- 
93, 98-112, 125-149, 179-203, 209-226, 233-242, 249-261, 266-271, 273-289, 293-318, 346-354, 360-371, 391-400 
and 369-382 of Seq ID No 179; 11-38, 44-65, 70-87, 129-135, 140-163, 171-177, 225-232, 238-249, 258-266, 
271-280, 284-291, 295-300, 329-337, 344-352, 405-412, 416-424, 426-434, 436-455, 462-475, 478-487 and 270- 
312 of Seq ID No 180; 5-17, 34-45, 59-69, 82-88, 117-129, 137-142, 158-165, 180-195, 201-206, 219-226, 241- 
260, 269-279, 292-305, 312-321, 341-347, 362-381, 396-410, 413-432, 434-445, 447-453, 482-487, 492-499, 507- 
516, 546-552, 556-565, 587-604 and 486-598 of Seq ID No 181; 4-15, 17-32, 40-47, 67-78, 90-98, 101-107, 111- 
136, 161-171, 184-198, 208-214, 234-245, 247-254, 272-279, 288-298, 303-310, 315-320, 327-333, 338-349, 364- 
374 and 378-396 of Seq ID No 182; 5-27, 33-49, 51-57, 74-81, 95-107, 130-137, 148-157, 173-184 and 75-235 
of Seq ID No 183; 6-23, 47-53, 57-63, 75-82, 97-105, 113-122, 124-134, 142-153, 159-164, 169-179, 181-187, 
192-208, 215-243, 247-257, 285-290, 303-310 and 30-51 of Seq ID No 184; 17-29, 44-52, 59-73, 77-83, 86-92, 
97-110, 118-153, 156-166, 173-179, 192-209, 225-231, 234-240, 245-251, 260-268, 274-279, 297-306, 328-340, 
353-360, 369-382, 384-397, 414-423, 431-436, 452-465, 492-498, 500-508, 516-552, 554-560, 568-574, 580-586,
609-617, 620-626, 641-647 and 208-219 of Seq ID No 185; 4-26, 32-45, 58-72, 111-119, 137-143, 146-159, 187- 
193, 221-231, 235-242, 250-273, 290-304, 311-321, 326-339, 341-347, 354-368, 397-403, 412-419, 426-432, 487- 
506, 580-592, 619-628, 663-685, 707-716, 743-751, 770-776, 787-792, 850-859, 866-873, 882-888, 922-931, 957- 
963, 975-981, 983-989, 1000-1008, 1023-1029, 1058-1064, 1089-1099, 1107-1114, 1139-1145, 1147-1156, 1217- 
1226, 1276-1281, 1329-1335, 1355-1366, 1382-1394, 1410-1416, 1418-1424, 1443-1451, 1461-1469, 1483-1489, 
1491-1501, 1515-1522, 1538-1544, 1549-1561, 1587-1593, 1603-1613, 1625-1630, 1636-1641, 1684-1690, 1706- 
1723, 1765-1771, 1787-1804, 1850-1857, 1863-1894, 1897-1910, 1926-1935, 1937-1943, 1960-1983, 1991-2005, 
2008-2014, 2018-2039 and 396-533, 1342-1502, 1672-1920 of Seq ID No 186; 4-25, 45-50, 53-65, 79-85, 87-92,
99-109, 126-137, 141-148, 156-183, 190-203, 212-217, 221-228, 235-242, 247-277, 287-293, 300-319, 321-330, 
341-361, 378-389, 394-406, 437-449, 455-461, 4::^-478, 482-491, 507-522, 544-554, 576-582, 587-593, 611-621, 
626-632, 649-661, 679-685, 696-704, 706-716, 726-736, 740-751, 759-766, 786-792, 797-802, 810-822, 824-832, 
843-852, 863-869, 874-879, 882-905 and 1-113, 210-232, 250-423, 536-564 of Seq ID No 187; 4-16, 33-39, 43- 
49, 54-85, 107-123, 131-147, 157-169, 177-187, 198-209, 220-230, 238-248, 277-286, 293-301, 303-315, 319-379, 
383-393, 402-414, 426-432, 439-449, 470-478, 483-497, 502-535, 552-566, 571-582, 596-601, 608-620, 631-643, 
651-656, 663-678, 680-699, 705-717, 724-732, 738-748, 756-763, 766-772, 776-791, 796-810, 819-827, 829-841, 
847-861, 866-871, 876-882, 887-894, 909-934, 941-947, 957-969, 986-994, 998-1028, 1033-1070, 1073-1080, 
1090-1096, 1098-1132, 1134-1159, 1164-1172, 1174-1201 and 617-635 of Seq ID No 188; 7-25, 30-40, 42-64, 
70-77, 85-118, 120-166, 169-199, 202-213, 222-244 and 190-203 of Seq ID No 189; 4-11, 15-53, 55-93, 95-113,
120-159, 164-200, 210-243, 250-258, 261-283, 298-319, 327-340, 356-366, 369-376, 380-386, 394-406, 409-421,
425-435, 442-454, 461-472, 480-490, 494-505, 507-514, 521-527, 533-544, 566-574 and 385-398 of Seq ID No 
190; 5-36, 66-72, 120-127, 146-152, 159-168, 172-184, 205-210, 221-232, 234-243, 251-275, 295-305, 325-332, 
367-373, 470-479, 482-487, 520-548, 592-600, 605-615, 627-642, 655-662, 664-698, 718-725, 734-763, 776-784, 
798-809, 811-842, 845-852, 867-872, 879-888, 900-928, 933-940, 972-977, 982-1003 and 12-190, 276-283, 666- 
806 of Seq ID No 191; 4-38, 63-68, 100-114, 160-173, 183-192, 195-210, 212-219, 221-238, 240-256, 258-266, 
274-290, 301-311, 313-319, 332-341, 357-363, 395-401, 405-410, 420-426, 435-450, 453-461, 468-475, 491-498, 
510-518, 529-537, 545-552, 585-592, 602-611, 634-639, 650-664 and 30-80, 89-105, 111-151 of Seq ID No 192; 
7-29, 31-39, 47-54, 63-74, 81-94, 97-117, 122-127, 146-157, 168-192, 195-204, 216-240, 251-259 and 195-203 of 
Seq ID No 193; 5-16, 28-34, 46-65, 79-94, 98-105, 107-113, 120-134, 147-158, 163-172, 180-186, 226-233, 237- 
251, 253-259, 275-285, 287-294, 302-308, 315-321, 334-344, 360-371, 399-412, 420-426 and 32-50 of Seq ID No 
194; 8-20, 30-36, 71-79, 90-96, 106-117, 125-138, 141-147, 166-174 and 75-90 of Seq ID No 195; 4-13, 15-33, 
43-52, 63-85, 98-114, 131-139, 146-174, 186-192, 198-206, 227-233 and 69-88 of Seq ID No 196; 4-22, 29-35, 
59-68, 153-170, 213-219, 224-238, 240-246, 263-270, 285-292, 301-321, 327-346, 356-371, 389-405, 411-418, 
421-427, 430-437, 450-467, 472-477, 482-487, 513-518, 531-538, 569-576, 606-614, 637-657, 662-667, 673-690, 
743-753, 760-767, T7Q-777, 786-802 and 96-230, 361-491, 572-585 of Seq ID No 197; 4-12, 21-36, 48-55, 74-82,
121-127, 195-203, 207-228, 247-262, 269-278, 280-289 and 102-210 of Seq ID No 198; 13-20, 23-31, 38-44, 78-
107, 110-118, 122-144, 151-164, 176-182, 190-198, 209-216, 219-243, 251-256, 289-304, 306-313 and 240-248 of 
Seq ID No 199; 5-26, 34-48, 57-77, 84-102, 116-132, 139-145, 150-162, 165-173, 176-187, 192-205, 216-221, 
234-248, 250-260 and 182-198 of Seq ID No 200; 10-19, 26-44, 53-62, 69-87, 90-96, 121-127, 141-146, 148-158, 
175-193, 204-259, 307-313, 334-348, 360-365, 370-401, 411-439, 441-450, 455-462, 467-472, 488-504 and 41-56 
of Seq ID No 201; 5-21, 36-42, 96-116, 123-130, 138-144, 146-157, 184-201, 213-228, 252-259, 277-297, 308- 
313, 318-323, 327-333 and 202-217 of Seq ID No 202; 6-26, 33-51, 72-90, 97-131, 147-154, 164-171, 187-216, 
231-236, 260-269, 275-283 and 1-127 of Seq ID No 203; 4-22, 24-38, 44-58, 72-88, 99-108, 110-117, 123-129, 
131-137, 142-147, 167-178, 181-190, 206-214, 217-223, 271-282, 290-305, 320-327, 329-336, 343-352, 354-364, 
396-402, 425-434, 451-456, 471-477, 485-491, 515-541, 544-583, 595-609, 611-626, 644-656, 660-681, 683-691, 
695-718 and 297-458 of Seq ID No 204; 5-43, 92-102, 107-116, 120-130, 137-144, 155-163, 169-174, 193-213 
and 24-135 of Seq ID No 205; 4-25, 61-69, 73-85, 88-95, 97-109, 111-130, 135-147, 150-157, 159-179, 182-201, 
206-212, 224-248, 253-260, 287-295, 314-331, 338-344, 365-376, 396-405, 413-422, 424-430, 432-449, 478-485, 
487-494, 503-517, 522-536, 544-560, 564-578, 585-590, 597-613, 615-623, 629-636, 640-649, 662-671, 713-721 
and 176-330 of Seq ID No 206; 31-37, 41-52, 58-79, 82-105, 133-179, 184-193, 199-205, 209-226, 256-277, 281- 
295, 297-314, 322-328, 331-337, 359-367, 379-395, 403-409, 417-432, 442-447, 451-460, 466-472 and 46-62, 296- 
341 of Seq ID No 207; 23-29, 56-63, 67-74, 96-108, 122-132, 139-146, 152-159, 167-178, 189-196, 214-231, 247- 
265, 274-293, 301-309, 326-332, 356-363, 378-395, 406-412, 436-442, 445-451, 465-479, 487-501, 528-555, 567-
581, 583-599, 610-617, 622-629, 638-662, 681-686, 694-700, 711-716 and 667-684 of Seq ID No 208; 20-51, 53- 
59, 109-115, 140-154, 185-191, 201-209, 212-218, 234-243, 253-263, 277-290, 303-313, 327-337, 342-349, 374- 
382, 394-410, 436-442, 464-477, 486-499, 521-530, 536-550, 560-566, 569-583, 652-672, 680-686, 698-704, 718- 
746, 758-770, 774-788, 802-827, 835-842, 861-869 and 258-416 of Seq ID No 209; 7-25, 39-45, 59-70, 92-108,
116-127, 161-168, 202-211, 217-227, 229-239, 254-262, 271-278, 291-300 and 278-295 of Seq ID No 210; 4-20,
27-33, 45-51, 53-62, 66-74, 81-88, 98-111, 124-130, 136-144, 156-179, 183-191 and 183-195 of Seq ID No 211;
12-24, 27-33, 43-49, 55-71, 77-85, 122-131, 168-177, 179-203, 209-214, 226-241 and 63-238 of Seq ID No 212;
4-19, 37-50, 120-126, 131-137, 139-162, 177-195, 200-209, 211-218, 233-256, 260-268, 271-283, 288-308 and 1- 
141 of Seq ID No 213; 11-17, 40-47, 57-63, 96-124, 141-162, 170-207, 223-235, 241-265, 271-277, 281-300, 312- 
318, 327-333, 373-379 and 231-368 of Seq ID No 214; 9-33, 41-48, 57-79, 97-103, 113-138, 146-157, 165-186, 
195-201, 209-215, 223-229, 237-247, 277-286, 290-297, 328-342 and 247-260 of Seq ID No 215; 7-15, 39-45, 
58-64, 79-84, 97-127, 130-141, 163-176, 195-203, 216-225, 235-247, 254-264, 271-279 and 64-72 of Seq ID No 
216; 4-12, 26-42, 46-65, 73-80, 82-94, 116-125, 135-146, 167-173, 183-190, 232-271, 274-282, 300-306, 320-343, 
351-362, 373-383, 385-391, 402-409, 414-426, 434-455, 460-466, 473-481, 485-503, 519-525, 533-542, 554-565, 
599-624, 645-651, 675-693, 717-725, 751-758, 767-785, 792-797, 801-809, 819-825, 831-836, 859-869, 890-897 
and 222-362, 756-896 of Seq ID No 217; 11-17, 22-28, 52-69, 73-83, 86-97, 123-148, 150-164, 166-177, 179- 
186, 188-199, 219-225, 229-243, 250-255 and 153-170 of Seq ID No 218; 4-61, 71-80, 83-90, 92-128, 133-153, 
167-182, 184-192, 198-212 and 56-73 of Seq ID No 219; 4-19, 26-37, 45-52, 58-66, 71-77, 84-92, 94-101, 107- 
118, 120-133, 156-168, 170-179, 208-216, 228-238, 253-273, 280-296, 303-317, 326-334 and 298-312 of Seq ID 
No 220; 7-13, 27-35, 38-56, 85-108, 113-121, 123-160, 163-169, 172-183, 188-200, 206-211, 219-238, 247-254 
and 141-157 of Seq ID No 221; 23-39, 45-73, 86-103, 107-115, 125-132, 137-146, 148-158, 160-168, 172-179, 
185-192, 200-207, 210-224, 233-239, 246-255, 285-334, 338-352, 355-379, 383-389, 408-417, 423-429, 446-456, 
460-473, 478-503, 522-540, 553-562, 568-577, 596-602, 620-636, 640-649, 655-663 and 433-440, 572-593 of Seq 
ID No 222; 4-42, 46-58, 64-76, 118-124, 130-137, 148-156, 164-169, 175-182, 187-194, 203-218, 220-227, 241- 
246, 254-259, 264-270, 275-289, 296-305, 309-314, 322-334, 342-354, 398-405, 419-426, 432-443, 462-475, 522- 
530, 552-567, 593-607, 618-634, 636-647, 653-658, 662-670, 681-695, 698-707, 709-720, 732-742, 767-792, 794- 
822, 828-842, 851-866, 881-890, 895-903, 928-934, 940-963, 978-986, 1003-1025, 1027-1043, 1058-1075, 1080- 
1087, 1095-1109, 1116-1122, 1133-1138, 1168-1174, 1179-1186, 1207-1214, 1248-1267 and 17-319, 417-563 of 
Seq ID No 223; 6-19, 23-33, 129-138, 140-150, 153-184, 190-198, 206-219, 235-245, 267-275, 284-289, 303-310, 
322-328, 354-404, 407-413, 423-446, 453-462, 467-481, 491-500 and 46-187 of Seq ID No 224; 4-34, 39-57, 78- 
86, 106-116, 141-151, 156-162, 165-172, 213-237, 252-260, 262-268, 272-279, 296-307, 332-338, 397-403, 406- 
416, 431-446, 448-453, 464-470, 503-515, 519-525, 534-540, 551-563, 578-593, 646-668, 693-699, 703-719, 738- 
744, 748-759, 77\''777, 807-813, 840-847, 87^-876, 897-903, 910-925, 967-976, 979-992 and 21-244, 381-499, 
818-959 of Seq ID No 225; 19-29, 65-75, 90-109, 111-137, 155-165, 169-175 and 118-136 of Seq ID No 226; 
15-20, 30-36, 55-63, 73-79, 90-117, 120-127, 136-149, 166-188, 195-203, 211-223, 242-255, 264-269, 281-287, 
325-330, 334-341, 348-366, 395-408, 423-429, 436-444, 452-465 and 147-155 of Seq ID No 227; 11-18, 21-53, 
77-83, 91-98, 109-119, 142-163, 173-181, 193-208, 216-227, 238-255, 261-268, 274-286, 290-297, 308-315, 326- 
332, 352-359, 377-395, 399-406, 418-426, 428-438, 442-448, 458-465, 473-482, 488-499, 514-524, 543-553, 564- 
600, 623-632, 647-654, 660-669, 672-678, 710-723, 739-749, 787-793, 820-828, 838-860, 889-895, 901-907, 924- 
939, 956-962, 969-976, 991-999, 1012-1018, 1024-1029, 1035-1072, 1078-1091, 1142-1161 and 74-438 of Seq ID 
No 228; 4-31, 41-52, 58-63, 65-73, 83-88, 102-117, 123-130, 150-172, 177-195, 207-217, 222-235, 247-253, 295- 
305, 315-328, 335-342, 359-365, 389-394, 404-413 and 156-420 of Seq ID No 229; 4-42, 56-69, 98-108, 120-125, 
210-216, 225-231, 276-285, 304-310, 313-318, 322-343 and 79-348 of Seq ID No 230; 12-21, 24-30, 42-50, 61- 
67, 69-85, 90-97, 110-143, 155-168 and 53-70 of Seq ID No 231; 4-26, 41-54, 71-78, 88-96, 116-127, 140-149, 
151-158, 161-175, 190-196, 201-208, 220-226, 240-247, 266-281, 298-305, 308-318, 321-329, 344-353, 370-378, 
384-405, 418-426, 429-442, 457-463, 494-505, 514-522 and 183-341 of Seq ID No 232; 4-27, 69-77, 79-101, 
117-123, 126-142, 155-161, 171-186, 200-206, 213-231, 233-244, 258-263, 269-275, 315-331, 337-346, 349-372, 
376-381, 401-410, 424-445, 447-455, 463-470, 478-484, 520-536, 546-555, 558-569, 580-597, 603-618, 628-638, 
648-660, 668-683, 717-723, 765-771, 781-788, 792-806, 812-822 and 92-231, 618-757 of Seq ID No 233; 11-47, 
63-75, 108-117, 119-128, 133-143, 171-185, 190-196, 226-232, 257-264, 278-283, 297-309, 332-338, 341-346, 
351-358, 362-372 and 41-170 of Seq ID No 234; 6-26, 50-56, 83-89, 108-114, 123-131, 172-181, 194-200, 221- 
238, 241-259, 263-271, 284-292, 304-319, 321-335, 353-358, 384-391, 408-417, 424-430, 442-448, 459-466, 487- 
500, 514-528, 541-556, 572-578, 595-601, 605-613, 620-631, 634-648, 660-679, 686-693, 702-708, 716-725, 730- 
735, 749-755, 770-777, 805-811, 831-837, 843-851, 854-860, 863-869, 895-901, 904-914, 922-929, 933-938, 947- 
952, 956-963, 1000-1005, 1008-1014, 1021-1030, 1131-1137, 1154-1164, 1166-1174 and 20-487, 757-1153 of 
Seq ID No 235; 10-34, 67-78, 131-146, 160-175, 189-194, 201-214, 239-250, 265-271, 296-305 and 26-74, 91- 
100, 105-303 of Seq ID No 236; 9-15, 19-32, 109-122, 143-150, 171-180, 186-191, 209-217, 223-229, 260-273, 
302-315, 340-346, 353-359, 377-383, 389-406, 420-426, 460-480 and 10-223, 231-251, 264-297, 312-336 of Seq 
ID No 237; 5-28, 76-81, 180-195, 203-209, 211-219, 227-234, 242-252, 271-282, 317-325, 350-356, 358-364, 394- 
400, 405-413, 417-424, 430-436, 443-449, 462-482, 488-498, 503-509, 525-537 and 22-344 of Seq ID No 238; 5- 
28, 42-54, 77-83, 86-93, 98-104, 120-127, 145-159, 166-176, 181-187, 189-197, 213-218, 230-237, 263-271, 285- 
291, 299-305, 326-346, 368-375, 390-395 and 1-151 of Seq ID No 239; 6-34, 48-55, 58-64, 84-101, 121-127, 
143-149, 153-159, 163-170, 173-181, 216-225, 227-240, 248-254, 275-290, 349-364, 375-410, 412-418, 432-438, 
445-451, 465-475, 488-496, 505-515, 558-564, 571-579, 585-595, 604-613, 626-643, 652-659, 677-686, 688-696, 
702-709, 731-747, 777-795, 820-828, 836-842, 845-856, 863-868, 874-882, 900-909, 926-943, 961-976, 980-986, 
992-998, 1022-1034, 1044-1074, 1085-1096, 1101-1112, 1117-1123, 1130-1147, 1181-1187, 1204-1211, 1213- 
1223, 1226-1239, 1242-1249, 1265-1271, 1273-1293, 1300-1308, 1361-1367, 1378-1384, 1395-1406, 1420-1428, 
1439-1446, 1454-1460, 1477-1487, 1509-1520, 1526-1536, 1557-1574, 1585-1596, 1605-1617, 1621-1627, 1631- 
1637, 1648-1654, 1675-1689, 1692-1698, 1700-1706, 1712-1719, 1743-1756 and 91-263 of Seq ID No 240; 4-16, 
75-90, 101-136, 138-144, 158-164, 171-177, 191-201, 214-222, 231-241, 284-290, 297-305, 311-321, 330-339, 
352-369, 378-385, 403-412, 414-422, 428-435, 457-473, 503-521, 546-554, 562-568, 571-582, 589-594, 600-608, 
626-635, 652-669, 687-702, 706-712, 718-724, 748-760, 770-775 and 261-272 of Seq ID No 241; 4-19, 30-41, 
46-57, 62-68, 75-92, 126-132, 149-156, 158-168, 171-184, 187-194, 210-216, 218-238, 245-253, 306-312, 323-329, 
340-351, 365-373, 384-391, 399-405, 422-432, 454-465, 471-481, 502-519, 530-541, 550-562, 566-572, 576-582, 
593-599, 620-634, 637-643, 645-651, 657-664, 688-701 and 541-551 of Seq ID No 242; 6-11, 17-25, 53-58, 80- 
86, 91-99, 101-113, 123-131, 162-169, 181-188, 199-231, 245-252 and 84-254 of Seq ID No 243; 13-30, 71-120, 
125-137, 139-145, 184-199 and 61-78 of Seq ID No 244; 9-30, 38-53, 63-70, 74-97, 103-150, 158-175, 183-217, 
225-253, 260-268, 272-286, 290-341, 352-428, 434-450, 453-460, 469-478, 513-525, 527-534, 554-563, 586-600, 
602-610, 624-640, 656-684, 707-729, 735-749, 757-763, 766-772, 779-788, 799-805, 807-815, 819-826, 831-855 
and 568-580 of Seq ID No 245; 11-21, 29-38 and 5-17 of Seq ID No 246; 2-9 of Seq ID No 247; 4-10, 16-28 
and 7-18, 26-34 of Seq ID No 248; 10-16 and 1-15 of Seq ID No 249; 4-11 of Seq ID No 250; 4-40, 42-51 
and 37-53 of Seq ID No 251; 4-21 and 22-29 of Seq ID No 252; 2-11 of Seq ID No 253; 9-17, 32-44 and 1-22 of 
Seq ID No 254; 19-25, 27-32 and 15-34 of Seq ID No 255; 4-12, 15-22 and 11-33 of Seq ID No 256; 10-17, 
24-30, 39-46, 51-70 and 51-61 of Seq ID No 257; 6-19 of Seq ID No 258; 6-11, 21-27, 31-54 and 11-29 of Seq 
ID No 259; 4-10, 13-45 and 11-35 of Seq ID No 260; 4-14, 23-32 and 11-35 of Seq ID No 261; 14-39, 45-51 
and 15-29 of Seq ID No 262; 4-11, 14-28 and 4-17 of Seq ID No 263; 4-16 and 2-16 of Seq ID No 264; 4-10, 
12-19, 39-50 and 6-22 of Seq ID No 265; 2-13 of Seq ID No 266; 4-11, 22-65 and 3-19 of Seq ID No 267; 17- 
23, 30-35, 39-46, 57-62 and 30-49 of Seq ID No 268; 4-19 and 14-22 of Seq ID No 269; 2-9 of Seq ID No 
270; 7-18, 30-43 and 4-12 of Seq ID No 271; 4-30, 39-47 and 5-22 of Seq ID No 272; 6-15 and 14-29 of Seq 
ID No 273; 4-34 and 23-35 of Seq ID No 274; 4-36, 44-57, 65-72 and 14-27 of Seq ID No 275; 4-18 and 11-20 
of Seq ID No 276; 5-19 of Seq ID No 277; 18-36 and 6-20 of Seq ID No 278; 4-10, 19-34, 41-84, 96-104 and 
50-63 of Seq ID No 279; 4-9, 19-27 and 8-21 of Seq ID No 280; 4-16, 18-28 and 22-30 of Seq ID No 281; 4- 
15 and 21-35 of Seq ID No 282; 4-17 and 3-13 of Seq ID No 283; 4-12 and 4-18 of Seq ID No 284; 4-24, 31- 
36 and 29-45 of Seq ID No 285; 12-22, 34-49 and 21-32 of Seq ID No 286; 4-17 and 22-32 of Seq ID No 287; 
4-16, 25-42 and 7-28 of Seq ID No 288; 4-10 and 7-20 of Seq ID No 289; 4-11, 16-36, 39-54 and 28-44 of 
Seq ID No 290; 5-20, 29-54 and 14-29 of Seq ID No 291; 24-33 and 10-22 of Seq ID No 292; 10-51, 54-61 
and 43-64 of Seq ID No 293; 7-13 and 2-17 of Seq ID No 294; 11-20 and 6-20 of Seq ID No 295; 4-30, 34-41 
and 19-28 of Seq ID No 296; 11-21 of Seq ID No 297; 4-16, 21-26 and 9-38 of Seq ID No 298; 4-12, 15-27, 
30-42, 66-72 and 10-24 of Seq ID No 299; 8-17 and 11-20 of Seq ID No 300; and 2-19 of Seq ID No 246; 1- 
12 of Seq ID No 247; 21-38 of Seq ID No 248; 2-22 of Seq ID No 254; 15-33 of Seq ID No 255; 11-32 of Seq 
ID No 256; 11-28 of Seq ID No 259; 10-27 of Seq ID No 260; 9-26 of Seq ID No 261; 4-16 of Seq ID No 
263; 1-18 of Seq ID No 266; 12-29 of Seq ID No 273; 6-23 of Seq ID No 276; 1-21 of Seq ID No 277; 47-64 
of Seq ID No 279; 28-45 of Seq ID No 285; 18-35 of Seq ID No 287; 14-31 of Seq ID No 291; 7-24 of Seq 
ID No 292; 8-25 of Seq ID No 299; 1-20 of Seq ID No 300; 18-33 of Seq ID No 151; 62-72 of Seq ID No 
151; 118-131 of Seq ID No 152; 195-220 of Seq ID No 154; 215-240 of Seq ID No 154; 255-280 of Seq ID 
No 154; 72-81 of Seq ID No 155; 174-186 of Seq ID No 156; 317-331 of Seq ID No 157; 35-59 of Seq ID No 
158; 54-84 of Seq ID No 158; 79-104 of Seq ID No 158; 33-58 of Seq ID No 159; 81-101 of Seq ID No 159; 
136-150 of Seq ID No 159; 173-186 of Seq ID No 159; 231-251 of Seq ID No 159; 22-48 of Seq ID No 161; 
24-39 of Seq ID No 162; 475-489 of Seq ID No 163; 38-56 of Seq ID No 164; 583-604 of Seq ID No 164; 
202-223 of Seq ID No 165; 222-247 of Seq ID No 165; 242-267 of Seq ID No 165; 262-287 of Seq ID No 
165; 282-307 of Seq ID No 165; 302-327 of Seq ID No 165; 25-48 of Seq ID No 166; 204-217 of Seq ID No 
167; 259-276 of Seq ID No 168; 121-139 of Seq ID No 169; 260-267 of Seq ID No 169; 215-240 of Seq ID 
No 169; 115-140 of Seq ID No 170; 182-204 of Seq ID No 172; 144-153 of Seq ID No 173; 205-219 of Seq 
ID No 173; 196-206 of Seq ID No 174; 240-249 of Seq ID No 174; 272-287 of Seq ID No 174; 199-223 of 
Seq ID No 174; 218-237 of Seq ID No 174; 226-249 of Seq ID No 175; 287-306 of Seq ID No 175; 430-449 
of Seq ID No 176; 361-375 of Seq ID No 177; 241-260 of Seq ID No 178; 483-502 of Seq ID No 181; 379- 
396 of Seq ID No 182; 31-51 of Seq ID No 184; 1436-1460 of Seq ID No 186; 1455-1474 of Seq ID No 186; 
1469-1487 of Seq ID No 186; 215-229 of Seq ID No 187; 534-561 of Seq ID No 187; 59-84 of Seq ID No 
187; 79-104 of Seq ID No 187; 618-635 of Seq ID No 188; 191-203 of Seq ID No 189; 386-398 of Seq ID No 
190; 65-83 of Seq ID No 191; 90-105 of Seq ID No 192; 112-136 of Seq ID No 192; 290-209 of Seq ID No 
193; 33-50 of Seq ID No 194; 76-90 of Seq ID No 195; 70-88 of Seq ID No 196; 418-442 of Seq ID No 197; 
574-585 of Seq ID No 197; 87-104 of Seq ID No 198; 124-148 of Seq ID No 198; 141-152 of Seq ID No 198; 
241-248 of Seq ID No 199; 183-198 of Seq ID No 200; 40-57 of Seq ID No 201; 202-217 of Seq ID No 202; 
50-74 of Seq ID No 203; 69-93 of Seq ID No 203; 88-112 of Seq ID No 203; 107-127 of Seq ID No 203; 74- 
92 of Seq ID No 205; 207-232 of Seq ID No 206; 227-252 of Seq ID No 206; 247-272 of Seq ID No 206; 47- 
60 of Seq ID No 207; 297-305 of Seq ID No 207; 312-337 of Seq ID No 207; 667-384 of Seq ID No 208; 279- 
295 of Seq ID No 210; 179-198 of Seq ID No 211; 27-51 of Seq ID No 213; 46-70 of Seq ID No 213; 65-89 
of Seq ID No 213; 84-108 of Seq ID No 213; 112-141 of Seq ID No 213; 248-260 of Seq ID No 215; 59-78 of 
Seq ID No 216; 154-170 of Seq ID No 218; 57-73 of Seq ID No 219; 297-314 of Seq ID No 220; 142-157 of 
Seq ID No 221; 428-447 of Seq ID No 222; 573-593 of Seq ID No 222; 523-544 of Seq ID No 223; 46-70 of 
Seq ID No 223; 65-89 of Seq ID No 223; 84-108 of Seq ID No 223; 122-151 of Seq ID No 223; 123-142 of 
Seq ID No 224; 903-921 of Seq ID No 225; 119-136 of Seq ID No 226; 142-161 of Seq ID No 227; 258-277 
of Seq ID No 228; 272-300 of Seq ID No 228; 295-322 of Seq ID No 228; 311-343 of Seq ID No 229; 278- 
304 of Seq ID No 229; 131-150 of Seq ID No 230; 195-218 of Seq ID No 230; 53-70 of Seq ID No 231; 184- 
208 of Seq ID No 232; 222-246 of Seq ID No 232; 241-265 of Seq ID No 232; 260-284 of Seq ID No 232; 
279-303 of Seq ID No 232; 317-341 of Seq ID No 232; 678-696 of Seq ID No 233; 88-114 of Seq ID No 235; 
464-481 of Seq ID No 235; 153-172 of Seq ID No 236; 137-155, 166-184 of Seq ID No 236; 215-228 of Seq 
ID No 236; 37-51 of Seq ID No 237; 53-75 of Seq ID No 237; 232-251 of Seq ID No 237; 318-336 of Seq ID 
No 237; 305-315 of Seq ID No 238; 131-156 of Seq ID No 238; 258-275 of Seq ID No 241; 107-137 of Seq 
ID No 243; 138-162 of Seq ID No 243; 157-181 of Seq ID No 243; 195-227 of Seq ID No 243; 62-78 of Seq 
ID No 244; 567-584 of Seq ID No 245. 
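
For readers who wish to handle the fragment listing above programmatically, the following Python sketch shows one possible way to parse entries of the form "ranges of Seq ID No N" into a lookup table and to flag ranges whose start position exceeds the end position, which may point to transcription errors in the listing. The function name `parse_fragment_listing` and both regular expressions are illustrative assumptions and are not part of the specification.

```python
import re
from collections import defaultdict

# Hypothetical helper, not part of the specification: it parses entries such as
# "19-29, 65-75 and 118-136 of Seq ID No 226" into {226: [(19, 29), ...]}.
ENTRY_RE = re.compile(
    r"(?P<ranges>[\d\s,\-and]+?)\s+of\s+Seq\s+ID\s+No\s*(?P<seq>\d+)"
)
RANGE_RE = re.compile(r"(\d+)\s*-\s*(\d+)")


def parse_fragment_listing(text):
    """Return (fragments, suspicious) parsed from a fragment listing.

    fragments maps a Seq ID number to a list of (start, end) tuples.
    suspicious collects (seq_id, start, end) tuples where start > end.
    """
    fragments = defaultdict(list)
    suspicious = []
    for match in ENTRY_RE.finditer(text):
        seq_id = int(match.group("seq"))
        # Ranges may be wrapped across lines ("179- \n186"), so whitespace
        # around the hyphen is tolerated.
        for start, end in RANGE_RE.findall(match.group("ranges")):
            start, end = int(start), int(end)
            if start > end:
                suspicious.append((seq_id, start, end))
            else:
                fragments[seq_id].append((start, end))
    return dict(fragments), suspicious
```

Keeping the suspicious ranges separate, rather than silently correcting them, leaves the decision on how to resolve them to a reader who can consult the Sequence Listing itself.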

The present invention also provides a process for producing a S. pyogenes hyperimmune serum reactive 
antigen or a fragment thereof according to the present invention comprising expressing one or more of 
the nucleic acid molecules according to the present invention in a suitable expression system. 

Moreover, the present invention provides a process for producing a cell, which expresses a S. pyogenes 
hyperimmune serum reactive antigen or a fragment thereof according to the present invention 
comprising transforming or transfecting a suitable host cell with the vector according to the present 
invention. 

According to the present invention a pharmaceutical composition, especially a vaccine, comprising a 
hyperimmune serum-reactive antigen or a fragment thereof as defined in the present invention or a 
nucleic acid molecule as defined in the present invention is provided. 

In a preferred embodiment the pharmaceutical composition further comprises an immunostimulatory 
substance, preferably selected from the group comprising polycationic polymers, especially polycationic 
peptides, immunostimulatory deoxynucleotides (ODNs), peptides containing at least two LysLeuLys 
motifs, especially KLKL5KLK, neuroactive compounds, especially human growth hormone, alum, Freund's 
complete or incomplete adjuvants or combinations thereof. 

In a more preferred embodiment the immunostimulatory substance is a combination of either a 
polycationic polymer and immunostimulatory deoxynucleotides or of a peptide containing at least two 
LysLeuLys motifs and immunostimulatory deoxynucleotides. 

In a still more preferred embodiment the polycationic polymer is a polycationic peptide, especially 
polyarginine. 

According to the present invention the use of a nucleic acid molecule according to the present invention 
or a hyperimmune serum-reactive antigen or fragment thereof according to the present invention for the 
manufacture of a pharmaceutical preparation, especially for the manufacture of a vaccine against S. 
pyogenes infection, is provided. 

Also an antibody, or at least an effective part thereof, which binds at least to a selective part of the 
hyperimmune serum-reactive antigen or a fragment thereof according to the present invention is 
provided herewith. 

In a preferred embodiment the antibody is a monoclonal antibody. 

In another preferred embodiment the effective part of the antibody comprises Fab fragments. 
In a further preferred embodiment the antibody is a chimeric antibody. 
In a still further preferred embodiment the antibody is a humanized antibody. 

The present invention also provides a hybridoma cell line, which produces an antibody according to the 
present invention. 

Moreover, the present invention provides a method for producing an antibody according to the present 
invention, characterized by the following steps: 

• initiating an immune response in a non-human animal by administering a hyperimmune 
serum-reactive antigen or a fragment thereof, as defined in the invention, to said animal, 

• removing an antibody containing body fluid from said animal, and 

• producing the antibody by subjecting said antibody containing body fluid to further 
purification steps. 

Accordingly, the present invention also provides a method for producing an antibody according to the 
present invention, characterized by the following steps: 

• initiating an immune response in a non-human animal by administering a hyperimmune 
serum-reactive antigen or a fragment thereof, as defined in the present invention, to said animal, 

• removing the spleen or spleen cells from said animal, 

• producing hybridoma cells of said spleen or spleen cells, 

• selecting and cloning hybridoma cells specific for said hyperimmune serum-reactive antigens or a 
fragment thereof, 

• producing the antibody by cultivation of said cloned hybridoma cells and optionally further 
purification steps. 

The antibodies provided or produced according to the above methods may be used for the preparation of 
a medicament for treating or preventing S. pyogenes infections. 

According to another aspect the present invention provides an antagonist which binds to a hyperimmune 
serum-reactive antigen or a fragment thereof according to the present invention. 

Such an antagonist capable of binding to a hyperimmune serum-reactive antigen or fragment thereof 
according to the present invention may be identified by a method comprising the following steps: 

a) contacting an isolated or immobilized hyperimmune serum-reactive antigen or a fragment 
thereof according to the present invention with a candidate antagonist under conditions to 
permit binding of said candidate antagonist to said hyperimmune serum-reactive antigen or 
fragment in the presence of a component capable of providing a detectable signal in response to 
the binding of the candidate antagonist to said hyperimmune serum reactive antigen or fragment 
thereof; and 

b) detecting the presence or absence of a signal generated in response to the binding of the 
antagonist to the hyperimmune serum reactive antigen or the fragment thereof. 

An antagonist capable of reducing or inhibiting the interaction activity of a hyperimmune serum-reactive 
antigen or a fragment thereof according to the present invention to its interaction partner may be 
identified by a method comprising the following steps: 

a) providing a hyperimmune serum reactive antigen or a hyperimmune fragment thereof according 
to the present invention, 

b) providing an interaction partner to said hyperimmune serum reactive antigen or a fragment 
thereof, especially an antibody according to the present invention, 

c) allowing interaction of said hyperimmune serum reactive antigen or fragment thereof to said 
interaction partner to form an interaction complex, 

d) providing a candidate antagonist, 

e) allowing a competition reaction to occur between the candidate antagonist and the interaction 
complex, 

f) determining whether the candidate antagonist inhibits or reduces the interaction activities of the 
hyperimmune serum reactive antigen or the fragment thereof with the interaction partner. 

The hyperimmune serum reactive antigens or fragments thereof according to the present invention may 
be used for the isolation and/or purification and/or identification of an interaction partner of said 
hyperimmune serum reactive antigen or fragment thereof. 

The present invention also provides a process for in vitro diagnosing a disease related to expression of a 
hyperimmune serum-reactive antigen or a fragment thereof according to the present invention 
comprising determining the presence of a nucleic acid sequence encoding said hyperimmune serum 
reactive antigen and fragment according to the present invention or the presence of the hyperimmune 
serum reactive antigen or fragment thereof according to the present invention. 

The present invention also provides a process for in vitro diagnosis of a bacterial infection, especially a S. 
pyogenes infection, comprising analyzing for the presence of a nucleic acid sequence encoding said 
hyperimmune serum reactive antigen and fragment according to the present invention or the presence of 
the hyperimmune serum reactive antigen or fragment thereof according to the present invention. 

Moreover, the present invention provides the use of a hyperimmune serum reactive antigen or fragment 
thereof according to the present invention for the generation of a peptide binding to said hyperimmune 
serum reactive antigen or fragment thereof, wherein the peptide is an anticaline. 

The present invention also provides the use of a hyperimmune serum-reactive antigen or fragment 
thereof according to the present invention for the manufacture of a functional nucleic acid, wherein the 
functional nucleic acid is selected from the group comprising aptamers and spiegelmers. 

The nucleic acid molecule according to the present invention may also be used for the manufacture of a 
functional ribonucleic acid, wherein the functional ribonucleic acid is selected from the group comprising 
ribozymes, antisense nucleic acids and siRNA. 

The present invention advantageously provides an efficient, relevant and comprehensive set of isolated 
nucleic acid molecules and their encoded hyperimmune serum reactive antigens and fragments thereof 
identified from S. pyogenes using an antibody preparation from multiple human plasma pools and surface 
expression libraries derived from the genome of S. pyogenes. Thus, the present invention fulfils a widely 
felt demand for S. pyogenes antigens, vaccines, diagnostics and products useful in procedures for 
preparing antibodies and for identifying compounds effective against S. pyogenes infection. 

An effective vaccine should be composed of proteins or polypeptides, which are expressed by all strains 
and are able to induce high affinity, abundant antibodies against cell surface components of S. pyogenes. 
The antibodies should be IgG1 and/or IgG3 for opsonization, and any IgG subtype and IgA for 
neutralisation of adherence and toxin action. A chemically defined vaccine must be definitely superior 
compared to a whole cell vaccine (attenuated or killed), since components of S. pyogenes, which cross- 
react with human tissues or inhibit opsonization {Whitnack, E. et al., 1985} can be eliminated, and the 
individual proteins inducing protective antibodies and/or a protective immune response can be selected. 

The approach, which has been employed for the present invention, is based on the interaction of group A 
streptococcal proteins or peptides with the antibodies present in human sera. The antibodies produced 
against S. pyogenes by the human immune system and present in human sera are indicative of the in vivo 
expression of the antigenic proteins and their immunogenicity. In addition, the antigenic proteins as 
identified by the bacterial surface display expression libraries using pools of pre-selected sera, are 
processed in a second and third round of screening by individual selected or generated sera. Thus the 
present invention supplies an efficient, relevant, comprehensive set of group A streptococcal antigens as a 
pharmaceutical composition, especially a vaccine preventing infection by S. pyogenes. 

In the antigen identification program for identifying a comprehensive set of antigens according to the 
present invention, at least two different bacterial surface expression libraries are screened with several 
serum pools or plasma fractions or other pooled antibody containing body fluids (antibody pools). The 
antibody pools are derived from a serum collection, which has been tested against antigenic compounds 
of S. pyogenes, such as whole cell extracts and culture supernatant proteins. Preferably, 2 distinct serum 
collections are used: 1. With very stable antibody repertoire: normal adults, clinically healthy people, who 
are non-carriers and overcame previous encounters or currently carriers of S. pyogenes without acute 
disease and symptoms, 2. With antibodies induced acutely by the presence of the pathogenic organism: 
patients with acute disease with different manifestations (e.g. S. pyogenes pharyngitis, wound infection 
and bacteraemia). Sera have to react with multiple group A streptococci-specific antigens in order to be 
considered hyperimmune and therefore relevant in the screening method applied for the present 
invention. The antibodies produced against streptococci by the human immune system and present in 
human sera are indicative of the in vivo expression of the antigenic proteins and their immunogenicity. 

The expression libraries as used in the present invention should allow expression of all potential antigens, 
e.g. derived from all surface proteins of S. pyogenes. Bacterial surface display libraries will be represented 
by a recombinant library of a bacterial host displaying a (total) set of expressed peptide sequences of 
group A streptococci on a number of selected outer membrane proteins (LamB, BtuB, FhuA) at the 
bacterial host membrane {Georgiou, G., 1997; Etz, H. et al., 2001}. One of the advantages of using 
recombinant expression libraries is that the identified hyperimmune serum-reactive antigens may be 
instantly produced by expression of the coding sequences of the screened and selected clones expressing 
the hyperimmune serum-reactive antigens without further recombinant DNA technology or cloning 
steps necessary. 

The comprehensive set of antigens identified by the described program according to the present 
invention is analysed further by one or more additional rounds of screening. Therefore individual 
antibody preparations or antibodies generated against selected peptides which were identified as 
immunogenic are used. According to a preferred embodiment the individual antibody preparations for 
the second round of screening are derived from patients who have suffered from an acute infection with 
group A streptococci, especially from patients who show an antibody titer above a certain minimum 
level, for example an antibody titer higher than the 80th percentile, preferably higher than the 90th percentile, 
especially higher than the 95th percentile of the human (patient or healthy individual) sera tested. Using such 
high titer individual antibody preparations in the second screening round allows a very selective 
identification of the hyperimmune serum-reactive antigens and fragments thereof from S. pyogenes. 
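
The percentile-based selection of high titer sera described above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the function name `select_high_titer_sera`, the dictionary layout (serum identifier mapped to titer) and the use of the "inclusive" interpolation method of `statistics.quantiles` are illustrative choices; the specification does not prescribe how the percentile is to be computed.

```python
import statistics


def select_high_titer_sera(titers, percentile=80):
    """Return the IDs of sera whose titer lies above the given percentile.

    titers: dict mapping a serum identifier to its antibody titer
            (hypothetical data layout, at least two sera required).
    percentile: integer between 1 and 99, e.g. 80, 90 or 95.
    """
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # quantiles(..., n=100) returns the 99 cut points between percentiles;
    # index percentile - 1 is the requested cut-off.
    cut_points = statistics.quantiles(titers.values(), n=100, method="inclusive")
    threshold = cut_points[percentile - 1]
    return sorted(sid for sid, titer in titers.items() if titer > threshold)
```

Raising `percentile` from 80 to 90 or 95 narrows the selection in the same way as the preferred and especially preferred embodiments named in the text.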

Following the high throughput screening procedure, the selected antigenic proteins, expressed as 
recombinant proteins or in vitro translated products, in case they cannot be expressed in prokaryotic 
expression systems, or the identified antigenic peptides (produced synthetically) are tested in a second 
screening by a series of ELISA and Western blotting assays for the assessment of their immunogenicity 
with a large human serum collection (> 100 uninfected, > 50 patients' sera). 

It is important that the individual antibody preparations (which may also be the selected serum) allow a 
selective identification of the hyperimmune serum-reactive antigens from all the promising candidates 
from the first round. Therefore, preferably at least 10 individual antibody preparations (i.e. antibody 
preparations (e.g. sera) from at least 10 different individuals having suffered from an infection to the 
chosen pathogen) should be used in identifying these antigens in the second screening round. Of course, 
it is possible to use also less than 10 individual preparations, however, selectivity of the step may not be 
optimal with a low number of individual antibody preparations. On the other hand, if a given 
hyperimmune serum-reactive antigen (or an antigenic fragment thereof) is recognized by at least 10 
individual antibody preparations, preferably at least 30, especially at least 50 individual antibody 
preparations, identification of the hyperimmune serum-reactive antigen is also selective enough for a 
proper identification. Hyperimmune serum-reactivity may of course be tested with as many individual 
preparations as possible (e.g. with more than 100 or even with more than 1,000). 

Therefore, the relevant portion of the hyperimmune serum-reactive antibody preparations according to 
the method of the present invention should preferably be at least 10, more preferred at least 30, especially 
at least 50 individual antibody preparations. Alternatively (or in combination) hyperimmune serum- 
reactive antigens may preferably be also identified with at least 20%, preferably at least 30%, especially at 
least 40% of all individual antibody preparations used in the second screening round. 
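
The count-based and fraction-based criteria of the preceding paragraph may be expressed, purely by way of illustration, as the following Python sketch. The tier labels, the function name `classify_recognition` and the returned dictionary are hypothetical; only the numerical thresholds (10, 30, 50 preparations and 20%, 30%, 40%) are taken from the text.

```python
# Thresholds from the text; the labels are illustrative only.
COUNT_TIERS = ((50, "especially preferred"), (30, "more preferred"), (10, "preferred"))
FRACTION_TIERS = ((0.40, "especially preferred"), (0.30, "more preferred"), (0.20, "preferred"))


def _tier(value, tiers):
    """Return the label of the highest tier whose threshold is met, else None."""
    for threshold, label in tiers:
        if value >= threshold:
            return label
    return None


def classify_recognition(n_reactive, n_tested):
    """Classify an antigen by how many individual antibody preparations recognise it.

    n_reactive: number of individual preparations reacting with the antigen.
    n_tested:   number of individual preparations used in the screening round.
    """
    if n_tested <= 0 or not 0 <= n_reactive <= n_tested:
        raise ValueError("expected 0 <= n_reactive <= n_tested and n_tested > 0")
    return {
        "by_count": _tier(n_reactive, COUNT_TIERS),
        "by_fraction": _tier(n_reactive / n_tested, FRACTION_TIERS),
    }
```

The two criteria are reported separately because the text allows them to be applied alternatively or in combination.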

According to a preferred embodiment of the present invention, the sera from which the individual 
antibody preparations for the second round of screening are prepared (or which are used as antibody 
preparations), are selected by their titer against S. pyogenes (e.g. against a preparation of this pathogen, 
such as a lysate, cell wall components and recombinant proteins). Preferably, some are selected with a 
total IgA titer above 4,000 U, especially above 6,000 U, and/or an IgG titer above 10,000 U, especially 
above 12,000 U (U = units, calculated from the OD405nm reading at a given dilution) when the whole 
organism (total lysate or whole cells) is used as antigen in the ELISA. 
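
The titer criteria above amount to a simple "and/or" test, sketched below in Python. The function name `meets_titer_criteria` and the `stringent` flag are assumptions made for illustration; the conversion of an OD405nm reading into units is not specified in the text and is therefore not modelled.

```python
def meets_titer_criteria(iga_units=None, igg_units=None, stringent=False):
    """Check a serum against the IgA and/or IgG titer cut-offs named in the text.

    iga_units, igg_units: titers in units (U); None if not measured.
    stringent: if True, apply the "especially" cut-offs (6,000 U / 12,000 U)
               instead of the basic ones (4,000 U / 10,000 U).
    """
    iga_cutoff = 6000 if stringent else 4000
    igg_cutoff = 12000 if stringent else 10000
    iga_ok = iga_units is not None and iga_units > iga_cutoff
    igg_ok = igg_units is not None and igg_units > igg_cutoff
    # "and/or": a serum qualifies when either titer exceeds its cut-off.
    return iga_ok or igg_ok
```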

The antibodies produced against streptococci by the human immune system and present in human sera 
are indicative of the in vivo expression of the antigenic proteins and their immunogenicity. The 
recognition of linear epitopes by antibodies can be based on sequences as short as 4-5 amino acids. Of 
course it does not necessarily mean that these short peptides are capable of inducing the given antibody 
in vivo. For that reason the defined epitopes, polypeptides and proteins are further to be tested in 
animals (mainly in mice) for their capacity to induce antibodies against the selected proteins in vivo. 

The preferred antigens are located on the cell surface or secreted, and are therefore accessible 
extracellularly. Antibodies against cell wall proteins are expected to serve two purposes: to inhibit 
adhesion and to promote phagocytosis. Antibodies against secreted proteins are beneficial in 
neutralisation of their function as toxin or virulence component. It is also known that bacteria 
communicate with each other through secreted proteins. Neutralizing antibodies against these proteins 
will interrupt growth-promoting cross-talk between or within streptococcal species. Bioinformatic 
analyses (signal sequences, cell wall localisation signals, transmembrane domains) proved to be very 
useful in assessing cell surface localisation or secretion. The experimental approach includes the isolation 
of antibodies with the corresponding epitopes and proteins from human serum, and the generation of 
immune sera in mice against (poly)peptides selected by the bacterial surface display screens. These sera 
are then used in a third round of screening as reagents in the following assays: cell surface staining of 
group A streptococci grown under different conditions (FACS, microscopy), determination of 
neutralizing capacity (toxin, adherence), and promotion of opsonization and phagocytosis (in vitro 
phagocytosis assay). 
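
As an illustration of the kind of bioinformatic analysis mentioned above for assessing cell wall localisation, the following Python sketch scans a protein sequence for the LPXTG motif, a cell wall sorting signal commonly found in surface proteins of gram-positive bacteria. The choice of this particular motif, the function name `find_lpxtg_motifs` and the example sequence in the usage note are assumptions; the text does not state which signals or tools were actually used.

```python
import re

# LPXTG: Leu-Pro-any residue-Thr-Gly, written in one-letter amino acid code.
LPXTG_RE = re.compile(r"LP.TG")


def find_lpxtg_motifs(protein_sequence):
    """Return a list of (position, motif) tuples for LPXTG matches.

    Positions are 1-based, matching the amino acid numbering convention
    used for the fragment ranges in this document.
    """
    sequence = protein_sequence.upper()
    return [(m.start() + 1, m.group()) for m in LPXTG_RE.finditer(sequence)]
```

For example, calling `find_lpxtg_motifs` on the made-up sequence "MKKALPSTGEAAA" reports a single motif, "LPSTG", at position 5. A match is only a hint of cell wall anchoring and would need to be combined with further evidence, such as a signal sequence or experimental surface staining.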

For that purpose, bacterial E. coli clones are directly injected into mice and immune sera taken and tested 
in the relevant in vitro assay for functional opsonic or neutralizing antibodies. Alternatively, specific 
antibodies may be purified from human or mouse sera using peptides or proteins as substrate. 

Host defence against S. pyogenes relies mainly on innate immunological mechanisms. Inducing high 
affinity antibodies of the opsonic and neutralizing type by vaccination helps the innate immune system to 
eliminate bacteria and toxins. This makes the method according to the present invention an optimal tool 
for the identification of group A streptococcal antigenic proteins. 

The skin and mucous membranes are formidable barriers against invasion by streptococci. However, 
once the skin or the mucous membranes are breached the first line of non-adaptive cellular defence 
begins its co-ordinate action through complement and phagocytes, especially the polymorphonuclear 
leukocytes (PMNs). These cells can be regarded as the cornerstones in eliminating invading bacteria. As 
group A streptococci are primarily extracellular pathogens, the major anti-streptococcal adaptive 
response comes from the humoral arm of the immune system, and is mediated through three major 
mechanisms: promotion of opsonization, toxin neutralisation, and inhibition of adherence. It is believed 
that opsonization is especially important, because of its requirement for an effective phagocytosis. For 
efficient opsonization the microbial surface has to be coated with antibodies and complement factors for 
recognition by PMNs through receptors to the Fc fragment of the IgG molecule or to activated C3b. After 
opsonization, streptococci are phagocytosed and killed. Antibodies bound to specific antigens on the cell 
surface of bacteria serve as ligands for the attachment to PMNs and to promote phagocytosis. The very 
same antibodies bound to the adhesins and other cell surface proteins are expected to neutralize adhesion 
and prevent colonization. The selection of antigens as provided by the present invention is thus well 
suited to identify those that will lead to protection against infection in an animal model or in humans. 

According to the antigen identification method used herein, the present invention can surprisingly 
provide a set of comprehensive novel nucleic acids and novel hyperimmune serum reactive antigens and 
fragments thereof of S. pyogenes, among other things, as described below. According to one aspect, the 
invention particularly relates to the nucleotide sequences encoding hyperimmune serum reactive 
antigens, which sequences are set forth in the Sequence Listing Seq ID No 1-150; the corresponding 
encoded amino acid sequences representing hyperimmune serum reactive antigens are set forth in the 
Sequence Listing Seq ID No 151-300. 

In a preferred embodiment of the present invention, nucleic acid molecules are provided which exhibit at 
least 70% identity over their entire length to a nucleotide sequence set forth with Seq ID No 1, 4-8, 10-18, 20, 
22, 24-32, 34-35, 38-40, 43-46, 49-51, 53-54, 57-61, 63, 65-71, 73, 75-77, 81-82, 88, 91-94 and 96-150. Most 
highly preferred are nucleic acids that comprise a region that is at least 80% or at least 85% identical over 
their entire length to a nucleic acid molecule set forth with Seq ID No 1, 4-8, 10-18, 20, 22, 24-32, 34-35, 
38-40, 43-46, 49-51, 53-54, 57-61, 63, 65-71, 73, 75-77, 81-82, 88, 91-94 and 96-150. In this regard, nucleic acid 
molecules at least 90%, 91%, 92%, 93%, 94%, 95%, or 96% identical over their entire length to the same are 
particularly preferred. Furthermore, those with at least 97% are highly preferred, those with at least 98% 
and at least 99% are particularly highly preferred, with at least 99% or 99.5% being the more preferred, 
with 100% identity being especially preferred. Moreover, preferred embodiments in this respect are 
nucleic acids which encode hyperimmune serum reactive antigens or fragments thereof (polypeptides) 
which retain substantially the same biological function or activity as the mature polypeptide encoded by 
said nucleic acids set forth in the Seq ID No 1, 4-8, 10-18, 20, 22, 24-32, 34-35, 38-40, 43-46, 49-51, 53-54, 
57-61, 63, 65-71, 73, 75-77, 81-82, 88, 91-94 and 96-150. 

Identity, as known in the art and used herein, is the relationship between two or more polypeptide 
sequences or two or more polynucleotide sequences, as determined by comparing the sequences. In the 
art, identity also means the degree of sequence relatedness between polypeptide or polynucleotide 
sequences, as the case may be, as determined by the match between strings of such sequences. Identity 
can be readily calculated. While there exist a number of methods to measure identity between two 
polynucleotide or two polypeptide sequences, the term is well known to skilled artisans (e.g. Sequence 
Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987). Preferred methods to determine 
identity are designed to give the largest match between the sequences tested. Methods to determine 
identity are codified in computer programs. Preferred computer program methods to determine identity 
between two sequences include, but are not limited to, GCG program package {Devereux, J. et al., 1984}, 
BLASTP, BLASTN, and FASTA {Altschul, S. et al., 1990}. 
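
As an illustration of how an identity value over the entire length of two sequences can be obtained, the following minimal sketch aligns two sequences globally and reports the percentage of identical positions. It is not the GCG, BLAST or FASTA implementation named above; the Needleman-Wunsch scoring values (match 1, mismatch -1, gap -1), the function names and the convention of dividing by the alignment length are assumptions chosen only for this example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings.

    The scoring values are illustrative assumptions, not taken from the text.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        raise ValueError("at least one sequence must be non-empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)
```

Under these assumptions, two 10-base sequences differing at a single position give 90% identity. Other programs may normalise differently (for instance by the length of the shorter sequence), so values are only comparable when the same convention is used.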

According to another aspect of the invention, nucleic acid molecules are provided which exhibit at least 
96% identity to the nucleic acid sequence set forth with Seq ID No 64. 

According to a further aspect of the present invention, nucleic acid molecules are provided which are 
identical to the nucleic acid sequences set forth with Seq ID No 3, 36, 47-48, 55, 62, 72, 80, 84, 95. 

The nucleic acid molecules according to the present invention can as a second alternative also be a nucleic 
acid molecule which is at least essentially complementary to the nucleic acid described as the first 
alternative above. As used herein complementary means that a nucleic acid strand is base pairing via 
Watson-Crick base pairing with a second nucleic acid strand. Essentially complementary as used herein 
means that the base pairing is not occurring for all of the bases of the respective strands but leaves a 
certain number or percentage of the bases unpaired or wrongly paired. The percentage of correctly 
pairing bases is preferably at least 70 %, more preferably 80 %, even more preferably 90 % and most 
preferably any percentage higher than 90 %. It is to be noted that a percentage of 70 % matching bases is 
considered as homology and the hybridization having this extent of matching base pairs is considered as 
stringent. Hybridization conditions for this kind of stringent hybridization may be taken from Current 
Protocols in Molecular Biology (John Wiley and Sons, Inc., 1987). More particularly, the hybridization 
conditions can be as follows: 

• Hybridization performed e.g. in 5 x SSPE, 5 x Denhardt's reagent, 0.1% SDS, 100 µg/mL sheared 
  DNA at 68°C 
• Moderate stringency wash in 0.2 x SSC, 0.1% SDS at 42°C 
• High stringency wash in 0.1 x SSC, 0.1% SDS at 68°C 

Genomic DNA with a GC content of 50% has an approximate Tm of 96°C. For 1% mismatch, the Tm is 
reduced by approximately 1°C. 
In addition, any of the further hybridization conditions described herein are in principle applicable as 
well. 
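
The relationship between the percentage of correctly pairing bases and the approximate Tm given above can be sketched as follows. This is only a worked illustration of the stated rule of thumb (about 96°C for perfectly matched genomic DNA of 50% GC content, reduced by about 1°C per 1% mismatch); the function names, the restriction to equal-length strands and the linear extrapolation to larger mismatch percentages are assumptions made for the example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_paired(strand, other):
    """Percentage of positions at which `strand` base-pairs with `other`.

    Both strands are written 5'->3' and are assumed to be of equal length,
    so `other` is reverse-complemented before the position-wise comparison.
    """
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    partner = reverse_complement(other)
    paired = sum(1 for x, y in zip(strand.upper(), partner) if x == y)
    return 100.0 * paired / len(strand)


def estimated_tm(percent_matching, tm_perfect=96.0, drop_per_percent=1.0):
    """Approximate Tm in degrees C using the 1 degree per 1% mismatch rule."""
    return tm_perfect - drop_per_percent * (100.0 - percent_matching)
```

With these assumptions, strands with 70% matching bases would have an estimated Tm of roughly 66°C, and strands with 90% matching bases roughly 86°C.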

Of course, all nucleic acid molecules which encode the same polypeptide molecule as those 
identified by the present invention are encompassed by any disclosure of a given coding sequence, since 
the degeneracy of the genetic code is directly applicable to unambiguously determine all possible nucleic 
acid molecules which encode a given polypeptide molecule, even if the number of such degenerate 
nucleic acid molecules may be high. This is also applicable to fragments of a given polypeptide, as long 
as the fragments encode a polypeptide suitable for use in vaccination, e.g. as an 
active or passive vaccine. 
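
How the degeneracy of the genetic code determines the full set of coding sequences for a given polypeptide can be illustrated with a short sketch. It assumes the standard codon assignments and uses hypothetical helper names; it counts and enumerates every DNA sequence encoding a short peptide and is meant as an illustration, not as a practical tool for full-length antigens, where the number of sequences becomes very large.

```python
from itertools import product
from math import prod

# Standard codon assignments, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Reverse mapping: amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMS.setdefault(amino_acid, []).append(codon)


def translate(dna):
    """Translate a DNA coding sequence codon by codon (incomplete tail ignored)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)


def enumerate_encodings(peptide):
    """Yield every DNA sequence that encodes the peptide."""
    for codons in product(*(SYNONYMS[aa] for aa in peptide)):
        yield "".join(codons)
```

For example, the tripeptide Met-Lys-Leu (`"MKL"`) has 1 x 2 x 6 = 12 possible coding sequences, each of which translates back to the same peptide.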

The nucleic acid molecule according to the present invention can as a third alternative also be a nucleic 
acid which comprises a stretch of at least 15 bases of the nucleic acid molecule according to the first and 
second alternative of the nucleic acid molecules according to the present invention as outlined above. 
Preferably, the bases form a contiguous stretch of bases. However, it is also within the scope of the 
present invention that the stretch consists of two or more moieties which are separated by a number of 
bases. 
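
Whether a candidate nucleic acid comprises a contiguous stretch of at least 15 bases of a reference sequence can be checked along the following lines. The sketch is a simple exact-match window search with hypothetical names; it covers only the preferred contiguous case and not a stretch made up of two or more separated moieties.

```python
def shared_stretches(candidate, reference, min_len=15):
    """Return start positions in `candidate` of windows of `min_len` bases
    that occur identically somewhere in `reference`."""
    candidate = candidate.upper()
    reference = reference.upper()
    windows = {reference[i:i + min_len]
               for i in range(len(reference) - min_len + 1)}
    return [i for i in range(len(candidate) - min_len + 1)
            if candidate[i:i + min_len] in windows]


def comprises_stretch(candidate, reference, min_len=15):
    """True if `candidate` contains at least one such contiguous stretch."""
    return bool(shared_stretches(candidate, reference, min_len))
```

To cover the complementary strand as well, the same check could be repeated with the reverse complement of the reference.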

The nucleic acid molecule according to the present invention can as a fourth alternative also be a nucleic 
acid molecule which anneals under stringent hybridisation conditions to any of the nucleic acids of the 
present invention according to the above outlined first, second, and third alternative. Stringent 
hybridisation conditions are typically those described herein. 

Finally, the nucleic acid molecule according to the present invention can as a fifth alternative also be a 
nucleic acid molecule which, but for the degeneracy of the genetic code, would hybridise to any of the 
nucleic acid molecules of the present invention according to the 
first, second, third, and fourth alternative as outlined above. This kind of nucleic acid molecule refers to 
the fact that preferably the nucleic acids according to the present invention code for the hyperimmune 
serum reactive antigens or fragments thereof according to the present invention. This kind of nucleic acid 
molecule is particularly useful in the detection of a nucleic acid molecule according to the present 
invention and thus the diagnosis of the respective microorganisms such as S. pyogenes and any disease or 
diseased condition where this kind of microorganism is involved. Preferably, the hybridisation would 
occur or be performed under stringent conditions as described in connection with the fourth alternative 
described above. 

Nucleic acid molecule as used herein generally refers to any ribonucleic acid molecule or 
deoxyribonucleic acid molecule, which may be unmodified RNA or DNA or modified RNA or DNA. 
Thus, for instance, nucleic acid molecule as used herein refers to, among others, single- and 
double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and 
double-stranded RNA, and RNA that is a mixture of single- and double-stranded regions, hybrid 
molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded, or 
triple-stranded, or a mixture of single- and double-stranded regions. In addition, nucleic acid molecule as 
used herein refers to triple-stranded regions 
comprising RNA or DNA or both RNA and DNA. The strands in such regions may be from the same 
molecule or from different molecules. The regions may include all of one or more of the molecules, but 
more typically involve only a region of some of the molecules. One of the molecules of a triple-helical 
region often is an oligonucleotide. As used herein, the term nucleic acid molecule includes DNAs or 
RNAs as described above that contain one or more modified bases. Thus, DNAs or RNAs with backbones 
modified for stability or for other reasons are "nucleic acid molecule" as that term is intended herein. 
Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as 
tritylated bases, to name just two examples, are nucleic acid molecule as the term is used herein. It will be 
appreciated that a great variety of modifications have been made to DNA and RNA that serve many 
useful purposes known to those of skill in the art. The term nucleic acid molecule as it is employed herein 
embraces such chemically, enzymatically or metabolically modified forms of nucleic acid molecule, as 
well as the chemical forms of DNA and RNA characteristic of viruses and cells, including simple and 
complex cells, inter alia. The term nucleic acid molecule also embraces short nucleic acid molecules often 
referred to as oligonucleotide(s). "Polynucleotide" and "nucleic acid" or "nucleic acid molecule" are often 
used interchangeably herein. 

Nucleic acid molecules provided in the present invention also encompass numerous unique fragments, 
both longer and shorter than the nucleic acid molecule sequences set forth in the Sequence Listing of the 
S. pyogenes coding regions, which can be generated by standard cloning methods. To be unique, a 
fragment must be of sufficient size to distinguish it from other known nucleic acid sequences, most 
readily determined by comparing any selected S. pyogenes fragment to the nucleotide sequences in 
computer databases such as GenBank. 

Additionally, modifications can be made to the nucleic acid molecules and polypeptides that are 
encompassed by the present invention. For example, nucleotide substitutions can be made which do not 
affect the polypeptide encoded by the nucleic acid, and thus any nucleic acid molecule which encodes a 
hyperimmune serum reactive antigen or fragments thereof is encompassed by the present invention. 

Furthermore, any of the nucleic acid molecules encoding hyperimmune serum reactive antigens or 
fragments thereof provided by the present invention can be functionally linked, using standard 
techniques such as standard cloning techniques, to any desired regulatory sequences, whether a S. 
pyogenes regulatory sequence or a heterologous regulatory sequence, heterologous leader sequence, 
heterologous marker sequence or a heterologous coding sequence to create a fusion protein. 

Nucleic acid molecules of the present invention may be in the form of RNA, such as mRNA or cRNA, or 
in the form of DNA, including, for instance, cDNA and genomic DNA obtained by cloning or produced 
by chemical synthetic techniques or by a combination thereof. The DNA may be triple-stranded, double- 
stranded or single-stranded. Single-stranded DNA may be the coding strand, also known as the sense 
strand, or it may be the non-coding strand, also referred to as the anti-sense strand. 

The present invention further relates to variants of the herein above described nucleic acid molecules 
which encode fragments, analogs and derivatives of the hyperimmune serum reactive antigens and 
fragments thereof having a deduced S. pyogenes amino acid sequence set forth in the Sequence Listing. A 
variant of the nucleic acid molecule may be a naturally occurring variant such as a naturally occurring 
allelic variant, or it may be a variant that is not known to occur naturally. Such non-naturally occurring 
variants of the nucleic acid molecule may be made by mutagenesis techniques, including those applied to 
nucleic acid molecules, cells or organisms. 

Among variants in this regard are variants that differ from the aforementioned nucleic acid molecules by 
nucleotide substitutions, deletions or additions. The substitutions, deletions or additions may involve one 
or more nucleotides. The variants may be altered in coding or non-coding regions or both. Alterations in 
the coding regions may produce conservative or non-conservative amino acid substitutions, deletions or 
additions. Preferred are nucleic acid molecules encoding a variant, analog, derivative or fragment, or a 
variant, analogue or derivative of a fragment, which have a S. pyogenes sequence as set forth in the 
Sequence Listing, in which several, a few, 5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid(s) are substituted, 
deleted or added, in any combination. Especially preferred among these are silent substitutions, additions 
and deletions, which do not alter the properties and activities of the S. pyogenes polypeptides set forth in 
the Sequence Listing. Also especially preferred in this regard are conservative substitutions. 

The peptides and fragments according to the present invention also include modified epitopes wherein 
preferably one or two of the amino acids of a given epitope are modified or replaced according to the 
rules disclosed in e.g. {Tourdot, S. et al., 2000}, as well as the nucleic acid sequences encoding such 
modified epitopes. 

It is clear that epitopes derived from the present epitopes by amino acid exchanges improving, 
conserving or at least not significantly impeding the T cell activating capability of the epitopes are also 
covered by the epitopes according to the present invention. Therefore the present epitopes also cover 
epitopes which do not contain the original sequence as derived from S. pyogenes, but trigger the same or 
preferably an improved T cell response. These epitopes are referred to as "heteroclitic"; they need to have a 
similar or preferably greater affinity to MHC/HLA molecules, and they need the ability to stimulate the T 
cell receptors (TCR) directed to the original epitope in a similar or preferably stronger manner. 

Heteroclitic epitopes can be obtained by rational design i.e. taking into account the contribution of 
individual residues to binding to MHC/HLA as for instance described by {Rammensee, H. et al., 1999}, 
combined with a systematic exchange of residues potentially interacting with the TCR and testing the 
resulting sequences with T cells directed against the original epitope. Such a design is possible for a 
skilled man in the art without much experimentation. 

Another possibility includes the screening of peptide libraries with T cells directed against the original 
epitope. A preferred way is the positional scanning of synthetic peptide libraries. Such approaches have 
been described in detail for instance by {Hemmer, B. et al., 1999} and the references given therein. 

As an alternative to epitopes represented by the present derived amino acid sequences or heteroclitic 
epitopes, also substances mimicking these epitopes e.g. "peptidemimetica" or "retro-inverso-peptides" can 
be applied. 

Another aspect of the design of improved epitopes is their formulation or modification with substances 
increasing their capacity to stimulate T cells. These include T helper cell epitopes, lipids or liposomes or 
preferred modifications as described in WO 01/78767. 

Another way to increase the T cell stimulating capacity of epitopes is their formulation with immune 
stimulating substances, for instance cytokines or chemokines like interleukin-2, -7, -12, -18, class I and II 
interferons (IFN), especially IFN-gamma, GM-CSF, TNF-alpha, flt3-ligand and others. 

As discussed additionally herein regarding nucleic acid molecule assays of the invention, for instance, 
nucleic acid molecules of the invention as discussed above, may be used as a hybridization probe for 
RNA, cDNA and genomic DNA to isolate full-length cDNAs and genomic clones encoding polypeptides 
of the present invention and to isolate cDNA and genomic clones of other genes that have a high 
sequence similarity to the nucleic acid molecules of the present invention. Such probes generally will 
comprise at least 15 bases. Preferably, such probes will have at least 20, at least 25 or at least 30 bases, and 
may have at least 50 bases. Particularly preferred probes will have at least 30 bases, and will have 50 
bases or less, such as 30, 35, 40, 45, or 50 bases. 

For example, the coding region of a nucleic acid molecule of the present invention may be isolated by 
screening a relevant library using the known DNA sequence to synthesize an oligonucleotide probe. A 
labeled oligonucleotide having a sequence complementary to that of a gene of the present invention is 
then used to screen a library of cDNA, genomic DNA or mRNA to determine to which members of the 
library the probe hybridizes. 

The nucleic acid molecules and polypeptides of the present invention may be employed as reagents and 
materials for development of treatments of and diagnostics for disease, particularly human disease, as 
further discussed herein relating to nucleic acid molecule assays, inter alia. 

The nucleic acid molecules of the present invention that are oligonucleotides can be used in the processes 
herein as described, but preferably for PCR, to determine whether or not the S. pyogenes genes identified 
herein in whole or in part are present and/or transcribed in infected tissue such as blood. It is recognized 
that such sequences will also have utility in diagnosis of the stage of infection and type of infection the 
pathogen has attained. For this and other purposes the arrays comprising at least one of the nucleic acids 
according to the present invention as described herein, may be used. 

The nucleic acid molecules according to the present invention may be used for the detection of nucleic 
acid molecules and organisms or samples containing these nucleic acids. Preferably such detection is for 
diagnosis, more preferably for the diagnosis of a disease related or linked to the presence or abundance of 
S. pyogenes. 

Eukaryotes (herein also "individual(s)"), particularly mammals, and especially humans, infected with S. 
pyogenes may be detected at the DNA level by a variety of techniques. Preferred candidates for 
distinguishing a S. pyogenes from other organisms can be obtained. 

The invention provides a process for diagnosing disease arising from infection with S. pyogenes, 
comprising determining from a sample isolated or derived from an individual an increased level of 
expression of a nucleic acid molecule having the sequence of a nucleic acid molecule set forth in the 
Sequence Listing. Expression of nucleic acid molecules can be measured using any one of the methods 
well known in the art for the quantitation of nucleic acid molecules, such as, for example, PCR, RT-PCR, 
RNase protection, Northern blotting, other hybridisation methods and the arrays described herein. 

Isolated as used herein means separated "by the hand of man" from its natural state; i.e., that, if it occurs 
in nature, it has been changed or removed from its original environment, or both. For example, a 
naturally occurring nucleic acid molecule or a polypeptide naturally present in a living organism in its 
natural state is not "isolated," but the same nucleic acid molecule or polypeptide separated from the 
coexisting materials of its natural state is "isolated", as the term is employed herein. As part of or 
following isolation, such nucleic acid molecules can be joined to other nucleic acid molecules, such as 
DNAs, for mutagenesis, to form fusion proteins, and for propagation or expression in a host, for instance. 
The isolated nucleic acid molecules, alone or joined to other nucleic acid molecules such as vectors, can be 
introduced into host cells, in culture or in whole organisms. Introduced into host cells in culture or in 
whole organisms, such DNAs still would be isolated, as the term is used herein, because they would not 
be in their naturally occurring form or environment. Similarly, the nucleic acid molecules and 
polypeptides may occur in a composition, such as media formulations, solutions for introduction of 
nucleic acid molecules or polypeptides, for example, into cells, compositions or solutions for chemical or 
enzymatic reactions, for instance, which are not naturally occurring compositions, and therein remain 
isolated nucleic acid molecules or polypeptides within the meaning of that term as it is employed herein. 

The nucleic acids according to the present invention may be chemically synthesized. Alternatively, the 
nucleic acids can be isolated from S. pyogenes by methods known to the one skilled in the art. 

According to another aspect of the present invention, a comprehensive set of novel hyperimmune serum 
reactive antigens and fragments thereof are provided by using the herein described antigen identification 
method. In a preferred embodiment of the invention, a hyperimmune serum-reactive antigen comprising 
an amino acid sequence being encoded by any one of the nucleic acid molecules herein described and 
fragments thereof are provided. In another preferred embodiment of the invention a novel set of 
hyperimmune serum-reactive antigens which comprises amino acid sequences selected from a group 
consisting of the polypeptide sequences as represented in Seq ID No 151, 154-158, 160-168, 170, 172, 174- 
182, 184-185, 188-190, 193-196, 199-201, 203-204, 207-211, 213, 215-221, 223, 225-227, 231-232, 238, 241-244 
and 246-300 and fragments thereof are provided. In a further preferred embodiment of the invention 
hyperimmune serum-reactive antigens which comprise amino acid sequences selected from a group 
consisting of the polypeptide sequences as represented in Seq ID No 214 and fragments thereof are 
provided. In a still further preferred embodiment of the invention hyperimmune serum-reactive antigens 
which comprise amino acid sequences selected from a group consisting of the polypeptide sequences as 
represented in Seq ID No 153, 186, 197-198, 205, 212, 222, 230, 234, 245 and fragments thereof are 
provided. 
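
The correspondence between the nucleic acid Seq ID numbers (1-150) and the polypeptide Seq ID numbers (151-300) can be made explicit with a small sketch. The fixed offset of 150 is an inference from the numbering of the Sequence Listing and from the parallel lists given in the text, not a rule stated as such; the function names are hypothetical.

```python
def expand_ranges(spec):
    """Expand a list such as '153, 186, 197-198' into individual Seq ID numbers."""
    ids = []
    for part in spec.replace(" and ", ", ").split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(x) for x in part.split("-"))
            ids.extend(range(low, high + 1))
        else:
            ids.append(int(part))
    return ids


SEQ_ID_OFFSET = 150  # assumed: polypeptide Seq ID = nucleic acid Seq ID + 150


def to_polypeptide_ids(nucleotide_ids, offset=SEQ_ID_OFFSET):
    """Map nucleic acid Seq ID numbers to the assumed polypeptide numbers."""
    return [n + offset for n in nucleotide_ids]


def to_nucleotide_ids(polypeptide_ids, offset=SEQ_ID_OFFSET):
    """Map polypeptide Seq ID numbers back to the assumed nucleic acid numbers."""
    return [p - offset for p in polypeptide_ids]
```

Applied to the polypeptide list 153, 186, 197-198, 205, 212, 222, 230, 234, 245, this mapping returns the nucleic acid list 3, 36, 47-48, 55, 62, 72, 80, 84, 95 given earlier, and Seq ID No 64 maps to Seq ID No 214.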

The hyperimmune serum reactive antigens and fragments thereof as provided in the invention include 
any polypeptide set forth in the Sequence Listing as well as polypeptides which have at least 70% identity 
to a polypeptide set forth in the Sequence Listing, preferably at least 80% or 85% identity to a polypeptide 
set forth in the Sequence Listing, and more preferably at least 90% similarity (more preferably at least 
90% identity) to a polypeptide set forth in the Sequence Listing and still more preferably at least 95%, 
96%, 97%, 98%, 99% or 99.5% similarity (still more preferably at least 95%, 96%, 97%, 98%, 99%, or 99.5% 
identity) to a polypeptide set forth in the Sequence Listing and also include portions of such polypeptides 
with such portion of the polypeptide generally containing at least 4 amino acids and more preferably at 
least 8, still more preferably at least 30, still more preferably at least 50 amino acids, such as 4, 8, 10, 20, 
30, 35, 40, 45 or 50 amino acids. 

The invention also relates to fragments, analogs, and derivatives of these hyperimmune serum reactive 
antigens and fragments thereof. The terms "fragment", "derivative" and "analog", when referring to an 
antigen whose amino acid sequence is set forth in the Sequence Listing, mean a polypeptide which 
retains essentially the same biological function or activity as such hyperimmune serum reactive antigen 
and fragment thereof. 

The fragment, derivative or analog of a hyperimmune serum reactive antigen and fragment thereof may 
be 1) one in which one or more of the amino acid residues are substituted with a conserved or non- 
conserved amino acid residue (preferably a conserved amino acid residue) and such substituted amino 
acid residue may or may not be one encoded by the genetic code, or 2) one in which one or more of the 
amino acid residues includes a substituent group, or 3) one in which the mature hyperimmune serum 
reactive antigen or fragment thereof is fused with another compound, such as a compound to increase the 
half-life of the hyperimmune serum reactive antigen and fragment thereof (for example, polyethylene 
glycol), or 4) one in which the additional amino acids are fused to the mature hyperimmune serum 
reactive antigen or fragment thereof, such as a leader or secretory sequence or a sequence which is 
employed for purification of the mature hyperimmune serum reactive antigen or fragment thereof or a 
proprotein sequence. Such fragments, derivatives and analogs are deemed to be within the scope of 
those skilled in the art from the teachings herein. 

The present invention also relates to antigens of different S. pyogenes isolates. Such homologues may 
easily be isolated based on the nucleic acid and amino acid sequences disclosed herein. There are more 
than 80 M protein serotypes distinguished to date and the typing is based on the variable region at the 
5' end of the emm gene (see e.g. Vitali et al. 2002). The presence of any antigen can accordingly be 
determined for every M serotype. In addition it is possible to determine the variability of a particular 
antigen in the various M serotypes as described for the sic gene (Hoe et al., 2001). The influence of the 
various M serotypes on the kind of disease it causes is summarized in a recent review (Cunningham, 
2000). In particular, two groups of serotypes can be distinguished: 

1) Those causing Pharyngitis and Scarlet fever (e.g. M types 1, 3, 5, 6, 14, 18, 19, 24) 

2) Those causing Pyoderma and Streptococcal skin infections (e.g. M types 2, 49, 57, 59, 60, 61) 

This can serve as the basis to identify the relevance of an antigen for the use as a vaccine or in general as a 
drug targeting a specific disease. 

The information available e.g. from the homepage of the CDC 
(http://www.cdc.gov/ncidod/biotech/strep/emmtypes.htm) includes a dendrogram showing the relatedness 
of various M serotypes. Further relevant references are Vitali et al., Journal of Clinical Microbiology 
40:679-681 (2002) (molecular emm typing method), Enright et al., Infection and Immunity 69:2416-2427 
(2001) (alternative molecular typing method (MLST)), Hoe et al., The Journal of Infectious Diseases 
183:633-639 (2001) (example for the variation of one antigen (sic) in many different serotypes) and 
Cunningham, Clinical Microbiology Reviews 13:470-511 (2000) (review on GAS pathogenesis). 
All emm types are completely listed and may be downloaded from the above mentioned address. 

The dendrogram was constructed by sequential use of the Wisconsin Package Version 10.1, Genetics 
Computer Group (GCG), Madison programs Pileup, Distances, and Growtree. Basically, 22 residues of 
signal sequence plus 83 additional N terminal residues were used for the alignments, which include 
selected sequences from the database. The selected sequences include new emm designations 103-124 
(described in the table below) as well as their closest "classical" M protein matches. Although this analysis is 
limited in that the C terminal ends are truncated arbitrarily, this is a typical result in that the dendrogram 
separates clusters of opacity factor positive strain M sequences from opacity factor negative strain M 
sequences. 

emm type/previous designation   GenBank accession number   Countries where isolated          Closest N-terminal M protein sequence match (% identity) 

emm103/st2034                   U74320                     PNG, Bra, Egy, Mal, Nep, NZ, US   M87 (66%) 
emm104/st2034                   AF056300                   PNG, Egy, Mal, Nep, NZ, US        M66 (72%) 
emm105/st4529                   AF060227                   Mal, Nep, NZ, US                  M5 (45%) 
emm106/st4532                   AF077666                   Mal, Egy, Iran, Nep               M27G (71%) 
emm107/st4264                   AF163686                   Mal, NZ                           M25 (52%) 
emm108/st4547                   AF052426                   Mal, Bra, Egy, Ira, NZ            M70 (84%) 
emm109/st3018                   AF077667                   Mal, Egy, NZ                      M28 (74%) 
emm110/st4935                   U92492                     Ind, Bul, NZ, Rus, US             M13 (60%) 
emm111/st4973                   AF128960                   Ind, Bra, Nep, US                 M80 (40%) 
emm112/stCmuk16                 AF091806                   Thi, Bra, Rus, US                 M27L/77 (59%) 
emm113/st2267                   AF078068                   NZ, Thai, Chi                     M13 (50%) 
emm114/st2967                   U50338                     US, Can, Gam, NZ, PNG             M73 (80%) 
emm115/st2980                   AF028712                   US, Bra, Rus                      M36 (64%) 
emm116/st2370                   AF156180                   US, Nep, NZ                       M52 (60%) 
emm117/st436                    AF058801                   US                                M13 (59%) 
emm118/st448                    AF058802                   US, Bra, Egy, Nep, NZ             M49 (79%) 
emm119/st3365                   AF083874                   US, Br, Nep                       M52 (59%) 
emm120/st1135                   AF296181                   Egy                               M56 (78%) 
emm121/st1161                   AF296182                   Egy                               M64 (64%) 
emm122/st1432                   AF222860                   Egy, Rus, Nep                     M18 (40%) 
emm123/st6949                   AF213451                   Arg, US, NZ                       M80 (68%) 
st1160/emm124                   AF149048 and AF018178      Egy, Mal, NZ                      M2 (82%) 

Abbreviations: Arg, Argentina; Bra, Brazil; Bul, Bulgaria; Can, Canada; Chi, Chile; Egy, Egypt; Gam, 
Gambia; Ind, India; Ira, Iran; Mal, Malaysia; Nep, Nepal; NZ, New Zealand; PNG, Papua New Guinea; 
Thi, Thailand; Rus, Russia; US, United States. %: Closest mature M protein sequence match to predicted 
50 mature N terminal residues from serologically characterized Lancefield type. 

emm types and sequence types: 

In many cases the emm sequence reference strains came directly from the M type collection of Dr. 
Rebecca Lancefield. Such strains are designated RCL. 

The sequences starting with "emm" indicate that isolates represented by this type have been analyzed by 
several reference laboratories besides the CDC streptococcal laboratories. Each of the "new" emm types 
emm94 through emm124 is represented by multiple independent isolates recovered from serious 
disease manifestations, is M protein nontypeable with all typing sera stocks available to international 
GAS reference laboratories, and demonstrates antiphagocytic properties in vitro by multiplying in normal 
human blood. Strains with emm sequences starting with "st" (sequence type) have not yet been 
completely validated by all of the reference laboratories. 

GAS Genetics: 

It has long been known that antiserum against serum opacity factor positive (SOF+) strains inhibits OF 
activity in a strain-specific manner. Therefore, 500-2700 base variable regions of the sof (serum opacity 
factor) gene representing at least 60 distinct sof genes were analysed from GAS opacity factor positive 
strains (and interestingly, a homolog commonly found in OF negative emm12 isolates and emm/M type 
12 reference strain). It was found that sof gene sequences are also remarkably variable among the 
different GAS strains, although usually well conserved within an emm type. Important strains include 
therefore emm1, emm100, emm101, emm102, emm103, emm104, emm105, emm106, emm107, emm108, 
emm109, emm11, emm110, emm111, emm112, emm113, emm114, emm115, emm116, emm117, emm118, 
emm119, emm12, emm120, emm121, emm122, emm123, emm124, emm13L, emm14, emm15, emm17, 
emm18, emm19, emm2, emm22, emm23, emm24, emm25, emm26, emm27G, emm28, emm29, emm3, 
emm30, emm31, emm32, emm33, emm34, emm36, emm37, emm38, emm39, emm4, emm40, emm41, 
emm42, emm43, emm44, emm46, emm47, emm48, emm49, emm5, emm50, emm51, emm52, emm53, 
emm54, emm55, emm56, emm57, emm58, emm59, emm6, emm60, emm61, emm62, emm63, emm64, 
emm65, emm66, emm67, emm68, emm69, emm70, emm71, emm72, emm73, emm74, emm75, emm76, 
emm77, emm78, emm79, emm8, emm80, emm81, emm82, emm83, emm84, emm85, emm86, emm87, 
emm88, emm89, emm9, emm90, emm91, emm92, emm93, emm94, emm95, emm96, emm97, emm98, 
emm99, st1389, st1731, st1759, st1815, st1967, st1969, st1rp31, st11014, st2037, st204, st211, st213, st2147, 
st1207, st245, st2460, st2461, st2463, st2904, st2911, st2917, st2926, st2940, st369, st3757, st3765, st3850, 
st5282, st6735, st7700, st809, st833, st854, st980584, stck249, stck401, std432, std631, std633, stIL103, stIL62, 
stns292, stns554, sts104, stc1400, stc1741, stc36, stc3852, stc5344, stc5345, stc57, stc6979, stc74a, stc839, 
stg10, stg11, stg1389, stg166b, stg1750, stg2078, stg3390, stg4222, stg4545, stg480, stg4831, stg485, stg4974, 
stg5063, stg6, stg62647, stg643, stg652, stg653, stg663, stg840, stg93464, stg97, stL1376, stL1929 and 
stL2764. 

Among the particularly preferred embodiments of the invention in this regard are the hyperimmune 
serum reactive antigens set forth in the Sequence Listing, variants, analogs, derivatives and fragments 
thereof, and variants, analogs and derivatives of fragments. Additionally, fusion polypeptides 
comprising such hyperimmune serum reactive antigens, variants, analogs, derivatives and fragments 
thereof, and variants, analogs and derivatives of the fragments are also encompassed by the present 
invention. Such fusion polypeptides and proteins, as well as nucleic acid molecules encoding them, can 
readily be made using standard techniques, including standard recombinant techniques for producing 
and expressing a recombinant polynucleic acid encoding a fusion protein. 

Among preferred variants are those that vary from a reference by conservative amino acid substitutions. 
Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of 
like characteristics. Typically seen as conservative substitutions are the replacements, one for another, 
among the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; 
exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; 
exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. 
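
The grouping of conservative substitutions described above can be illustrated with a minimal sketch. The function name `is_conservative` and the use of three-letter residue codes are assumptions made for illustration only; the groups themselves are those named in the preceding paragraph.

```python
# Conservative-substitution groups as listed in the preceding paragraph.
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},  # aliphatic residues
    {"Ser", "Thr"},                # hydroxyl residues
    {"Asp", "Glu"},                # acidic residues
    {"Asn", "Gln"},                # amide residues
    {"Lys", "Arg"},                # basic residues
    {"Phe", "Tyr"},                # aromatic residues
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if replacing `original` by `replacement` stays within one group.

    Hypothetical helper: identical residues are not counted as a substitution.
    """
    if original == replacement:
        return False
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("Leu", "Ile"))  # both aliphatic
    print(is_conservative("Leu", "Asp"))  # aliphatic versus acidic
```

Residues that appear in none of the listed groups (for example Gly or Cys) are never reported as conservative by this sketch, which mirrors the wording of the paragraph rather than any particular substitution matrix.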

Further particularly preferred in this regard are variants, analogs, derivatives and fragments, and 
variants, analogs and derivatives of the fragments, having the amino acid sequence of any polypeptide 
set forth in the Sequence Listing, in which several, a few, 5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid 
residues are substituted, deleted or added, in any combination. Especially preferred among these are 
silent substitutions, additions and deletions, which do not alter the properties and activities of the 
polypeptide of the present invention. Also especially preferred in this regard are conservative 
substitutions. Most highly preferred are polypeptides having an amino acid sequence set forth in the 
Sequence Listing without substitutions. Specifically suitable amino acid substitutions are those which are 
contained in homologues for the sequences disclosed in the Sequence Listing according to the present 
application. A suitable sequence derivative of an antigen or epitope as disclosed herein therefore includes 
one or more variations being present in one or more strains or serotypes of S. pyogenes (preferably 1, 2, 3, 
4, 5, 6, 7, 8, 9, or 10 amino acid exchanges which are based on such homolog variations). Such antigens 
comprise sequences which may be naturally occurring sequences or newly created artificial sequences. 
These preferred antigen variants are based on such naturally occurring sequence variations, e.g. forming 
a "master sequence" for the antigenic regions of the polypeptides according to the present invention. 
Suitable examples for such homolog variations or exchanges are given in Table 5 in the example section. 
For example, a given S. pyogenes sequence may be amended by including one or more such variations 
thereby creating an artificial (i.e. non-naturally occurring) variant of this given (naturally occurring) 
antigen or epitope sequence. 
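
As a minimal sketch of how such an artificial variant could be derived, the following applies a small set of amino acid exchanges to a sequence. The function name `apply_variations`, the 1-based position mapping, the one-letter residue codes and the toy sequence are all assumptions for illustration; they are not taken from the Sequence Listing or from Table 5. The upper limit of 10 exchanges follows the preferred range stated above.

```python
from typing import Mapping


def apply_variations(sequence: str,
                     variations: Mapping[int, str],
                     max_exchanges: int = 10) -> str:
    """Return a copy of `sequence` with the given exchanges applied.

    `variations` maps a 1-based residue position to the replacement residue
    (one-letter code). Hypothetical helper, not part of the disclosure.
    """
    if not 1 <= len(variations) <= max_exchanges:
        raise ValueError("expected between 1 and %d exchanges" % max_exchanges)
    residues = list(sequence)
    for position, new_residue in variations.items():
        if not 1 <= position <= len(residues):
            raise ValueError("position %d lies outside the sequence" % position)
        residues[position - 1] = new_residue  # convert to 0-based index
    return "".join(residues)


if __name__ == "__main__":
    # Toy sequence, invented purely for illustration.
    print(apply_variations("MKTAYIAK", {2: "R", 5: "F"}))  # MRTAFIAK
```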

The hyperimmune serum reactive antigens and fragments thereof of the present invention are preferably 
provided in an isolated form, and preferably are purified to homogeneity. 

Also among preferred embodiments of the present invention are polypeptides comprising fragments of 
the polypeptides having the amino acid sequence set forth in the Sequence Listing, and fragments of 
variants and derivatives of the polypeptides set forth in the Sequence Listing. 

In this regard a fragment is a polypeptide having an amino acid sequence that is entirely the same as part 
but not all of the amino acid sequence of the aforementioned hyperimmune serum reactive antigen and 
fragment thereof, and variants, derivatives, analogs or fragments thereof. Such fragments may be 
"free-standing", i.e., not part of or fused to other amino acids or polypeptides, or they may be comprised 
within a larger polypeptide of which they form a part or region. Also preferred in this aspect of the 
invention are fragments characterised by structural or functional attributes of the polypeptide of the 
present invention, i.e. fragments that comprise alpha-helix and alpha-helix forming regions, beta-sheet 
and beta-sheet forming regions, turn and turn-forming regions, coil and coil-forming regions, hydrophilic 
regions, hydrophobic regions, alpha amphipathic regions, beta-amphipathic regions, flexible regions, 
surface-forming regions, substrate binding regions, and high antigenic index regions of the polypeptide 
of the present invention, and combinations of such fragments. Preferred regions are those that mediate 
activities of the hyperimmune serum reactive antigens and fragments thereof of the present invention. 
Most highly preferred in this regard are fragments that have a chemical, biological or other activity of the 
hyperimmune serum reactive antigen and fragments thereof of the present invention, including those 
with a similar activity or an improved activity, or with a decreased undesirable activity. Particularly 
preferred are fragments comprising receptors or domains of enzymes that confer a function essential for 
viability of S. pyogenes or the ability to cause disease in humans. Further preferred polypeptide fragments 
are those that comprise or contain antigenic or immunogenic determinants in an animal, especially in a 
human. 
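
The definition of a fragment given above (a sequence identical to part, but not all, of a reference sequence) can be expressed as a short check. This is only a sketch under stated assumptions: the function name `is_fragment`, the one-letter residue codes and the toy sequences are hypothetical, and the check covers sequence identity only, not any structural or functional attribute.

```python
def is_fragment(candidate: str, reference: str) -> bool:
    """Return True if `candidate` is a contiguous part, but not all, of `reference`."""
    return 0 < len(candidate) < len(reference) and candidate in reference


if __name__ == "__main__":
    reference = "MKTAYIAK"  # toy sequence, invented for illustration
    print(is_fragment("TAYI", reference))      # part of the reference
    print(is_fragment("MKTAYIAK", reference))  # the whole reference is not a fragment
```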

An antigenic fragment is defined as a fragment of the identified antigen which is itself antigenic or 
may be made antigenic when provided as a hapten. Therefore, also antigens or antigenic fragments 
showing one or (for longer fragments) only a few amino acid exchanges are enabled with the present 
invention, provided that the antigenic capacities of such fragments with amino acid exchanges are not 
severely deteriorated on the exchange(s), i.e., suited for eliciting an appropriate immune response in an 
individual vaccinated with this antigen and identified by individual antibody preparations from 
individual sera. 

Preferred examples of such fragments of a hyperimmune serum-reactive antigen are selected from the 
group consisting of peptides comprising amino acid sequences of columns "predicted immunogenic aa" 
and "Location of identified immunogenic region" of Table 1; the serum reactive epitopes of Table 2, 

especially peptides comprising amino acid 4-44, 57-65, 67-98, 101-107, 109-125, 131-144, 146-159, 168-173, 
181-186, 191-200, 206-213, 229-245, 261-269, 288-301, 304-317, 323-328, 350-361, 374-384, 388-407, 416-425 
and 1-114 of Seq ID No 151; 5-17, 49-64, 77-82, 87-98, 118-125, 127-140, 142-150, 153-159, 191-207, 212-218, 
226-270, 274-287, 297-306, 325-331, 340-347, 352-369, 377-382, 390-395 and 29-226 of Seq ID No 152; 4-16, 
20-26, 32-74, 76-87, 93-108, 116-141, 148-162, 165-180, 206-219, 221-228, 230-236, 239-245, 257-268, 313-328, 
330-335, 353-359, 367-375, 394-403, 414-434, 437-444, 446-453, 456-464, 478-487, 526-535, 541-552, 568-575, 
577-584, 589-598, 610-618, 624-643, 653-665, 667-681, 697-718, 730-748, 755-761, 773-794, 806-821, 823-831, 
837-845, 862-877, 879-889, 896-919, 924-930, 935-940, 947-955, 959-964, 969-986, 991-1002, 1012-1036, 1047- 
1056, 1067-1073, 1079-1085, 1088-1111, 1130-1135, 1148-1164, 1166-1173, 1185-1192, 1244-1254 and 919-929 
of Seq ID No 153; 5-44, 62-74, 78-83, 99-105, 107-113, 124-134, 161-174, 176-194, 203-211, 216-237, 241-247, 
253-266, 272-299, 323-349, 353-360 and 145-305 of Seq ID No 154; 15-39, 52-61, 72-81, 92-97 and 71-81 of 
Seq ID No 155; 13-19, 21-31, 40-108, 115-122, 125-140, 158-180, 187-203, 210-223, 235-245 and 173-186 of 
Seq ID No 156; 5-12, 19-27, 29-39, 59-67, 71-78, 80-88, 92-104, 107-124, 129-142, 158-168, 185-191, 218-226, 
230-243, 256-267, 272-277, 283-291, 307-325, 331-344, 346-352 and 316-331 of Seq ID No 157; 6-28, 43-53, 
60-76, 93-103 and 21-99 of Seq ID No 158; 10-30, 120-126, 145-151, 159-169, 174-182, 191-196, 201-206, 214- 
220, 222-232, 254-272, 292-307, 313-323, 332-353, 361-369, 389-396, 401-415, 428-439, 465-481, 510-517, 560- 
568 and 9-264 of Seq ID No 159; 5-29, 39-45, 107-128 and 1-112 of Seq ID No 160; 4-38, 42-50, 54-60, 65-71, 
91-102 and 21-56 of Seq ID No 161; 4-13, 19-25, 41-51, 54-62, 68-75, 79-89, 109-122, 130-136, 172-189, 192- 
198, 217-224, 262-268, 270-276, 281-298, 315-324, 333-342, 353-370, 376-391 and 23-39 of Seq ID No 162; 6- 
41, 49-58, 62-103, 117-124, 147-166, 173-194, 204-211, 221-229, 255-261, 269-284, 288-310, 319-325, 348-380, 
383-389, 402-410, 424-443, 467-479, 496-517, 535-553, 555-565, 574-581, 583-591 and 474-489 of Seq ID No 
163; 8-35, 52-57, 66-73, 81-88, 108-114, 125-131, 160-167, 174-180, 230-235, 237-249, 254-262, 278-285, 308- 
314, 321-326, 344-353, 358-372, 376-383, 393-411, 439-446, 453-464, 471-480, 485-492, 502-508, 523-529, 533- 
556, 558-563, 567-584, 589-597, 605-619, 625-645, 647-666, 671-678, 690-714, 721-728, 741-763, 766-773, 777- 
787, 792-802, 809-823, 849-864 and 37-241, 409-534, 582-604, 743-804 of Seq ID No 164; 4-17, 24-36, 38-44, 
59-67, 72-90, 92-121, 126-149, 151-159, 161-175, 197-215, 217-227, 241-247, 257-264, 266-275, 277-284, 293- 
307, 315-321, 330-337, 345-350, 357-366, 385-416 and 202-337 of Seq ID No 165; 4-20, 22-46, 49-70, 80-89, 
96-103, 105-119, 123-129, 153-160, 181-223, 227-233, 236-243, 248-255, 261-269, 274-279, 283-299, 305-313, 
315-332, 339-344, 349-362, 365-373, 380-388, 391-397, 402-407 and 1-48 of Seq ID No 166; 18-37, 41-63, 100- 
106, 109-151, 153-167, 170-197, 199-207, 212-229, 232-253, 273-297 and 203-217 of Seq ID No 167; 20-26, 54- 
61, 80-88, 94-101, 113-119, 128-136, 138-144, 156-188, 193-201, 209-217, 221-229, 239-244, 251-257, 270-278, 
281-290, 308-315, 319-332, 339-352, 370-381, 388-400, 411-417, 426-435, 468-482, 488-497, 499-506, 512-521 
and 261-273 of Seq ID No 168; 6-12, 16-36, 50-56, 86-92, 115-125, 143-152, 163-172, 193-203, 235-244, 280- 
289, 302-315, 325-348, 370-379, 399-405, 411-417, 419-429, 441-449, 463-472, 482-490, 500-516, 536-543, 561- 
569, 587-594, 620-636, 647-653, 659-664, 677-685, 687-693, 713-719, 733-740, 746-754, 756-779, 792-799, 808- 
817, 822-828, 851-865, 902-908, 920-938, 946-952, 969-976, 988-1005, 1018-1027, 1045-1057, 1063-1069, 1071- 
1078, 1090-1099, 1101-1109, 1113-1127, 1130-1137, 1162-1174, 1211-1221, 1234-1242, 1261-1268, 1278-1284, 
1312-1317, 1319-1326, 1345-1353, 1366-1378, 1382-1394, 1396-1413, 1415-1424, 1442-1457, 1467-1474, 1482- 
1490, 1492-1530, 1537-1549, 1559-1576, 1611-1616, 1624-1641 and 1-414, 443-614, 997-1392 of Seq ID No 
169; 14-42, 70-75, 90-100, 158-181 and 1-164 of Seq ID No 170; 4-21, 30-36, 54-82, 89-97, 105-118, 138-147 
and 126-207 of Seq ID No 171; 4-21, 31-66, 96-104, 106-113, 131-142 and 180-204 of Seq ID No 172; 5-23, 
31-36, 38-55, 65-74, 79-88, 101-129, 131-154, 156-165, 183-194, 225-237, 245-261, 264-271, 279-284, 287-297, 
313-319, 327-336, 343-363, 380-386 and 11-197, 204-219, 258-372 of Seq ID No 173; 4-20, 34-41, 71-86, 100- 
110, 113-124, 133-143, 150-158, 160-166, 175-182, 191-197, 213-223, 233-239, 259-278, 298-322 and 195-289 of 
Seq ID No 174; 4-10, 21-35, 44-52, 54-62, 67-73, 87-103, 106-135, 161-174, 177-192, 200-209, 216-223, 249- 
298, 304-312, 315-329 and 12-130 of Seq ID No 175; 10-27, 33-38, 48-55, 70-76, 96-107, 119-133, 141-147, 
151-165, 183-190, 197-210, 228-236, 245-250, 266-272, 289-295, 297-306, 308-315, 323-352, 357-371, 381-390, 
394-401, 404-415, 417-425, 427-462, 466-483, 485-496, 502-507, 520-529, 531-541, 553-570, 577-588, 591-596, 
600-610, 619-632, 642-665, 671-692, 694-707 and 434-444 of Seq ID No 176; 6-14, 16-25, 36-46, 52-70, 83-111, 
129-138, 140-149, 153-166, 169-181, 188-206, 212-220, 223-259, 261-269, 274-282, 286-293, 297-306, 313-319, 
329-341, 343-359, 377-390, 409-415, 425-430 and 360-375 of Seq ID No 177; 4-26, 28-48, 54-62, 88-121, 147- 
162, 164-201, 203-237, 245-251 and 254-260 of Seq ID No 178; 12-21, 26-32, 66-72, 87-93, 98-112, 125-149, 
179-203, 209-226, 233-242, 249-261, 266-271, 273-289, 293-318, 346-354, 360-371, 391-400 and 369-382 of Seq 
ID No 179; 11-38, 44-65, 70-87, 129-135, 140-163, 171-177, 225-232, 238-249, 258-266, 271-280, 284-291, 295- 
300, 329-337, 344-352, 405-412, 416-424, 426-434, 436-455, 462-475, 478-487 and 270-312 of Seq ID No 180; 
5-17, 34-45, 59-69, 82-88, 117-129, 137-142, 158-165, 180-195, 201-206, 219-226, 241-260, 269-279, 292-305, 
312-321, 341-347, 362-381, 396-410, 413-432, 434-445, 447-453, 482-487, 492-499, 507-516, 546-552, 556-565, 
587-604 and 486-598 of Seq ID No 181; 4-15, 17-32, 40-47, 67-78, 90-98, 101-107, 111-136, 161-171, 184-198, 
208-214, 234-245, 247-254, 272-279, 288-298, 303-310, 315-320, 327-333, 338-349, 364-374 and 378-396 of Seq 
ID No 182; 5-27, 33-49, 51-57, 74-81, 95-107, 130-137, 148-157, 173-184 and 75-235 of Seq ID No 183; 6-23, 
47-53, 57-63, 75-82, 97-105, 113-122, 124-134, 142-153, 159-164, 169-179, 181-187, 192-208, 215-243, 247-257, 
285-290, 303-310 and 30-51 of Seq ID No 184; 17-29, 44-52, 59-73, 77-83, 86-92, 97-110, 118-153, 156-166, 
173-179, 192-209, 225-231, 234-240, 245-251, 260-268, 274-279, 297-306, 328-340, 353-360, 369-382, 384-397, 
414-423, 431-436, 452-465, 492-498, 500-508, 516-552, 554-560, 568-574, 580-586, 609-617, 620-626, 641-647 
and 208-219 of Seq ID No 185; 4-26, 32-45, 58-72, 111-119, 137-143, 146-159, 187-193, 221-231, 235-242, 250- 
273, 290-304, 311-321, 326-339, 341-347, 354-368, 397-403, 412-419, 426-432, 487-506, 580-592, 619-628, 663- 
685, 707-716, 743-751, 770-776, 787-792, 850-859, 866-873, 882-888, 922-931, 957-963, 975-981, 983-989, 1000- 
1008, 1023-1029, 1058-1064, 1089-1099, 1107-1114, 1139-1145, 1147-1156, 1217-1226, 1276-1281, 1329-1335, 
1355-1366, 1382-1394, 1410-1416, 1418-1424, 1443-1451, 1461-1469, 1483-1489, 1491-1501, 1515-1522, 1538- 
1544, 1549-1561, 1587-1593, 1603-1613, 1625-1630, 1636-1641, 1684-1690, 1706-1723, 1765-1771, 1787-1804, 
1850-1857, 1863-1894, 1897-1910, 1926-1935, 1937-1943, 1960-1983, 1991-2005, 2008-2014, 2018-2039 and 
396-533, 1342-1502, 1672-1920 of Seq ID No 186; 4-25, 45-50, 53-65, 79-85, 87-92, 99-109, 126-137, 141-148, 
156-183, 190-203, 212-217, 221-228, 235-242, 247-277, 287-293, 300-319, 321-330, 341-361, 378-389, 394-406, 
437-449, 455-461, 472-478, 482-491, 507-522, 544-554, 576-582, 587-593, 611-621, 626-632, 649-661, 679-685, 
696-704, 706-716, 726-736, 740-751, 759-766, 786-792, 797-802, 810-822, 824-832, 843-852, 863-869, 874-879, 
882-905 and 1-113, 210-232, 250-423, 536-564 of Seq ID No 187; 4-16, 33-39, 43-49, 54-85, 107-123, 131-147, 
157-169, 177-187, 198-209, 220-230, 238-248, 277-286, 293-301, 303-315, 319-379, 383-393, 402-414, 426-432, 
439-449, 470-478, 483-497, 502-535, 552-566, 571-582, 596-601, 608-620, 631-643, 651-656, 663-678, 680-699, 
705-717, 724-732, 738-748, 756-763, 766-772, 776-791, 796-810, 819-827, 829-841, 847-861, 866-871, 876-882, 
887-894, 909-934, 941-947, 957-969, 986-994, 998-1028, 1033-1070, 1073-1080, 1090-1096, 1098-1132, 1134- 
1159, 1164-1172, 1174-1201 and 617-635 of Seq ID No 188; 7-25, 30-40, 42-64, 70-77, 85-118, 120-166, 169- 
199, 202-213, 222-244 and 190-203 of Seq ID No 189; 4-11, 15-53, 55-93, 95-113, 120-159, 164-200, 210-243, 
250-258, 261-283, 298-319, 327-340, 356-366, 369-376, 380-386, 394-406, 409-421, 425-435, 442-454, 461-472, 
480-490, 494-505, 507-514, 521-527, 533-544, 566-574 and 385-398 of Seq ID No 190; 5-36, 66-72, 120-127, 
146-152, 159-168, 172-184, 205-210, 221-232, 234-243, 251-275, 295-305, 325-332, 367-373, 470-479, 482-487, 
520-548, 592-600, 605-615, 627-642, 655-662, 664-698, 718-725, 734-763, 776-784, 798-809, 811-842, 845-852, 
867-872, 879-888, 900-928, 933-940, 972-977, 982-1003 and 12-190, 276-283, 666-806 of Seq ID No 191; 4-38, 
63-68, 100-114, 160-173, 183-192, 195-210, 212-219, 221-238, 240-256, 258-266, 274-290, 301-311, 313-319, 
332-341, 357-363, 395-401, 405-410, 420-426, 435-450, 453-461, 468-475, 491-498, 510-518, 529-537, 545-552, 
585-592, 602-611, 634-639, 650-664 and 30-80, 89-105, 111-151 of Seq ID No 192; 7-29, 31-39, 47-54, 63-74, 
81-94, 97-117, 122-127, 146-157, 168-192, 195-204, 216-240, 251-259 and 195-203 of Seq ID No 193; 5-16, 28- 
34, 46-65, 79-94, 98-105, 107-113, 120-134, 147-158, 163-172, 180-186, 226-233, 237-251, 253-259, 275-285, 
287-294, 302-308, 315-321, 334-344, 360-371, 399-412, 420-426 and 32-50 of Seq ID No 194; 8-20, 30-36, 71- 
79, 90-96, 106-117, 125-138, 141-147, 166-174 and 75-90 of Seq ID No 195; 4-13, 15-33, 43-52, 63-85, 98-114, 
131-139, 146-174, 186-192, 198-206, 227-233 and 69-88 of Seq ID No 196; 4-22, 29-35, 59-68, 153-170, 213- 
219, 224-238, 240-246, 263-270, 285-292, 301-321, 327-346, 356-371, 389-405, 411-418, 421-427, 430-437, 450- 
467, 472-477, 482-487, 513-518, 531-538, 569-576, 606-614, 637-657, 662-667, 673-690, 743-753, 760-767, 770- 
777, 786-802 and 96-230, 361-491, 572-585 of Seq ID No 197; 4-12, 21-36, 48-55, 74-82, 121-127, 195-203, 
207-228, 247-262, 269-278, 280-289 and 102-210 of Seq ID No 198; 13-20, 23-31, 38-44, 78-107, 110-118, 122- 
144, 151-164, 176-182, 190-198, 209-216, 219-243, 251-256, 289-304, 306-313 and 240-248 of Seq ID No 199; 
5-26, 34-48, 57-77, 84-102, 116-132, 139-145, 150-162, 165-173, 176-187, 192-205, 216-221, 234-248, 250-260 
and 182-198 of Seq ID No 200; 10-19, 26-44, 53-62, 69-87, 90-96, 121-127, 141-146, 148-158, 175-193, 204- 
259, 307-313, 334-348, 360-365, 370-401, 411-439, 441-450, 455-462, 467-472, 488-504 and 41-56 of Seq ID No 
201; 5-21, 36-42, 96-116, 123-130, 138-144, 146-157, 184-201, 213-228, 252-259, 277-297, 308-313, 318-323, 
327-333 and 202-217 of Seq ID No 202; 6-26, 33-51, 72-90, 97-131, 147-154, 164-171, 187-216, 231-236, 260- 
269, 275-283 and 1-127 of Seq ID No 203; 4-22, 24-38, 44-58, 72-88, 99-108, 110-117, 123-129, 131-137, 142- 
147, 167-178, 181-190, 206-214, 217-223, 271-282, 290-305, 320-327, 329-336, 343-352, 354-364, 396-402, 425- 
434, 451-456, 471-477, 485-491, 515-541, 544-583, 595-609, 611-626, 644-656, 660-681, 683-691, 695-718 and 
297-458 of Seq ID No 204; 5-43, 92-102, 107-116, 120-130, 137-144, 155-163, 169-174, 193-213 and 24-135 of 
Seq ID No 205; 4-25, 61-69, 73-85, 88-95, 97-109, 111-130, 135-147, 150-157, 159-179, 182-201, 206-212, 224- 
248, 253-260, 287-295, 314-331, 338-344, 365-376, 396-405, 413-422, 424-430, 432-449, 478-485, 487-494, 503- 
517, 522-536, 544-560, 564-578, 585-590, 597-613, 615-623, 629-636, 640-649, 662-671, 713-721 and 176-330 of 
Seq ID No 206; 31-37, 41-52, 58-79, 82-105, 133-179, 184-193, 199-205, 209-226, 256-277, 281-295, 297-314, 
322-328, 331-337, 359-367, 379-395, 403-409, 417-432, 442-447, 451-460, 466-472 and 46-62, 296-341 of Seq 
ID No 207; 23-29, 56-63, 67-74, 96-108, 122-132, 139-146, 152-159, 167-178, 189-196, 214-231, 247-265, 274- 
293, 301-309, 326-332, 356-363, 378-395, 406-412, 436-442, 445-451, 465-479, 487-501, 528-555, 567-581, 583- 
599, 610-617, 622-629, 638-662, 681-686, 694-700, 711-716 and 667-684 of Seq ID No 208; 20-51, 53-59, 109- 
115, 140-154, 185-191, 201-209, 212-218, 234-243, 253-263, 277-290, 303-313, 327-337, 342-349, 374-382, 394- 
410, 436-442, 464-477, 486-499, 521-530, 536-550, 560-566, 569-583, 652-672, 680-686, 698-704, 718-746, 758- 
770, 774-788, 802-827, 835-842, 861-869 and 258-416 of Seq ID No 209; 7-25, 39-45, 59-70, 92-108, 116-127, 
161-168, 202-211, 217-227, 229-239, 254-262, 271-278, 291-300 and 278-295 of Seq ID No 210; 4-20, 27-33, 
45-51, 53-62, 66-74, 81-88, 98-111, 124-130, 136-144, 156-179, 183-191 and 183-195 of Seq ID No 211; 12-24, 
27-33, 43-49, 55-71, 77-85, 122-131, 168-177, 179-203, 209-214, 226-241 and 63-238 of Seq ID No 212; 4-19, 
37-50, 120-126, 131-137, 139-162, 177-195, 200-209, 211-218, 233-256, 260-268, 271-283, 288-308 and 1-141 of 
Seq ID No 213; 11-17, 40-47, 57-63, 96-124, 141-162, 170-207, 223-235, 241-265, 271-277, 281-300, 312-318, 
327-333, 373-379 and 231-368 of Seq ID No 214; 9-33, 41-48, 57-79, 97-103, 113-138, 146-157, 165-186, 195- 
201, 209-215, 223-229, 237-247, 277-286, 290-297, 328-342 and 247-260 of Seq ID No 215; 7-15, 39-45, 58-64, 
79-84, 97-127, 130-141, 163-176, 195-203, 216-225, 235-247, 254-264, 271-279 and 64-72 of Seq ID No 216; 4- 
12, 26-42, 46-65, 73-80, 82-94, 116-125, 135-146, 167-173, 183-190, 232-271, 274-282, 300-306, 320-343, 351- 
362, 373-383, 385-391, 402-409, 414-426, 434-455, 460-466, 473-481, 485-503, 519-525, 533-542, 554-565, 599- 
624, 645-651, 675-693, 717-725, 751-758, 767-785, 792-797, 801-809, 819-825, 831-836, 859-869, 890-897 and 
222-362, 756-896 of Seq ID No 217; 11-17, 22-28, 52-69, 73-83, 86-97, 123-148, 150-164, 166-177, 179-186, 
188-199, 219-225, 229-243, 250-255 and 153-170 of Seq ID No 218; 4-61, 71-80, 83-90, 92-128, 133-153, 167- 
182, 184-192, 198-212 and 56-73 of Seq ID No 219; 4-19, 26-37, 45-52, 58-66, 71-77, 84-92, 94-101, 107-118, 
120-133, 156-168, 170-179, 208-216, 228-238, 253-273, 280-296, 303-317, 326-334 and 298-312 of Seq ID No 
220; 7-13, 27-35, 38-56, 85-108, 113-121, 123-160, 163-169, 172-183, 188-200, 206-211, 219-238, 247-254 and 
141-157 of Seq ID No 221; 23-39, 45-73, 86-103, 107-115, 125-132, 137-146, 148-158, 160-168, 172-179, 185- 
192, 200-207, 210-224, 233-239, 246-255, 285-334, 338-352, 355-379, 383-389, 408-417, 423-429, 446-456, 460- 
473, 478-503, 522-540, 553-562, 568-577, 596-602, 620-636, 640-649, 655-663 and 433-440, 572-593 of Seq ID 
No 222; 4-42, 46-58, 64-76, 118-124, 130-137, 148-156, 164-169, 175-182, 187-194, 203-218, 220-227, 241-246, 
254-259, 264-270, 275-289, 296-305, 309-314, 322-334, 342-354, 398-405, 419-426, 432-443, 462-475, 522-530, 
552-567, 593-607, 618-634, 636-647, 653-658, 662-670, 681-695, 698-707, 709-720, 732-742, 767-792, 794-822, 
828-842, 851-866, 881-890, 895-903, 928-934, 940-963, 978-986, 1003-1025, 1027-1043, 1058-1075, 1080-1087, 
1095-1109, 1116-1122, 1133-1138, 1168-1174, 1179-1186, 1207-1214, 1248-1267 and 17-319, 417-563 of Seq ID 
No 223; 6-19, 23-33, 129-138, 140-150, 153-184, 190-198, 206-219, 235-245, 267-275, 284-289, 303-310, 322- 
328, 354-404, 407-413, 423-446, 453-462, 467-481, 491-500 and 46-187 of Seq ID No 224; 4-34, 39-57, 78-86, 
106-116, 141-151, 156-162, 165-172, 213-237, 252-260, 262-268, 272-279, 296-307, 332-338, 397-403, 406-416, 
431-446, 448-453, 464-470, 503-515, 519-525, 534-540, 551-563, 578-593, 646-668, 693-699, 703-719, 738-744, 
748-759, 771-777, 807-813, 840-847, 870-876, 897-903, 910-925, 967-976, 979-992 and 21-244, 381-499, 818-959 
of Seq ID No 225; 19-29, 65-75, 90-109, 111-137, 155-165, 169-175 and 118-136 of Seq ID No 226; 15-20, 30- 
36, 55-63, 73-79, 90-117, 120-127, 136-149, 166-188, 195-203, 211-223, 242-255, 264-269, 281-287, 325-330, 
334-341, 348-366, 395-408, 423-429, 436-444, 452-465 and 147-155 of Seq ID No 227; 11-18, 21-53, 77-83, 91- 
98, 109-119, 142-163, 173-181, 193-208, 216-227, 238-255, 261-268, 274-286, 290-297, 308-315, 326-332, 352- 
359, 377-395, 399-406, 418-426, 428-438, 442-448, 458-465, 473-482, 488-499, 514-524, 543-553, 564-600, 623- 
632, 647-654, 660-669, 672-678, 710-723, 739-749, 787-793, 820-828, 838-860, 889-895, 901-907, 924-939, 956- 
962, 969-976, 991-999, 1012-1018, 1024-1029, 1035-1072, 1078-1091, 1142-1161 and 74-438 of Seq ID No 228; 
4-31, 41-52, 58-63, 65-73, 83-88, 102-117, 123-130, 150-172, 177-195, 207-217, 222-235, 247-253, 295-305, 315- 
328, 335-342, 359-365, 389-394, 404-413 and 156-420 of Seq ID No 229; 4-42, 56-69, 98-108, 120-125, 210-216, 
225-231, 276-285, 304-310, 313-318, 322-343 and 79-348 of Seq ID No 230; 12-21, 24-30, 42-50, 61-67, 69-85, 
90-97, 110-143, 155-168 and 53-70 of Seq ID No 231; 4-26, 41-54, 71-78, 88-96, 116-127, 140-149, 151-158, 
161-175, 190-196, 201-208, 220-226, 240-247, 266-281, 298-305, 308-318, 321-329, 344-353, 370-378, 384-405, 
418-426, 429-442, 457-463, 494-505, 514-522 and 183-341 of Seq ID No 232; 4-27, 69-77, 79-101, 117-123, 
126-142, 155-161, 171-186, 200-206, 213-231, 233-244, 258-263, 269-275, 315-331, 337-346, 349-372, 376-381, 
401-410, 424-445, 447-455, 463-470, 478-484, 520-536, 546-555, 558-56% 580-597, 603-618, 628-638, 648-660, 
668-683, 717-723, 765-771, 781-788, 792-806, 812-822 and 92-231, 618-757 of Seq ID No 233; 11-47, 63-75, 
108-117, 119-128, 133-143, 171-185, 190-196, 226-232, 257-264, 278-283, 297-309, 332-338, 341-346, 351-358, 
362-372 and 41-170 of Seq ID No 234; 6-26, 50-56, 83-89, 108-114, 123-131, 172-181, 194-200, 221-238, 241- 
259, 263-271, 284-292, 304-319, 321-335, 353-358, 384-391, 408-417, 424-430, 442-448, 459-466, 487-500, 514- 
528, 541-556, 572-578, 595-601, 605-613, 620-631, 634-648, 660-679, 686-693, 702-708, 716-725, 730-735, 749- 
755, 770-777, 805-811, 831-837, 843-851, 854-860, 863-869, 895-901, 904-914, 922-929, 933-938, 947-952, 956- 
963, 1000-1005, 1008-1014, 1021-1030, 1131-1137, 1154-1164, 1166-1174 and 20-487, 757-1153 of Seq ID No 
235; 10-34, 67-78, 131-146, 160-175, 189-194, 201-214, 239-250, 265-271, 296-305 and 26-74, 91-100, 105-303 
of Seq ID No 236; 9-15, 19-32, 109-122, 143-150, 171-180, 186-191, 209-217, 223-229, 260-273, 302-315, 340- 
346, 353-359, 377-383, 389-406, 420-426, 460-480 and 10-223, 231-251, 264-297, 312-336 of Seq ID No 237; 5- 
28, 76-81, 180-195, 203-209, 211-219, 227-234, 242-252, 271-282, 317-325, 350-356, 358-364, 394-400, 405-413, 
417-424, 430-436, 443-449, 462-482, 488-498, 503-509, 525-537 and 22-344 of Seq ID No 238; 5-28, 42-54, 77- 
83, 86-93, 98-104, 120-127, 145-159, 166-176, 181-187, 189-197, 213-218, 230-237, 263-271, 285-291, 299-305, 
326-346, 368-375, 390-395 and 1-151 of Seq ID No 239; 6-34, 48-55, 58-64, 84-101, 121-127, 143-149, 153-159, 
163-170, 173-181, 216-225, 227-240, 248-254, 275-290, 349-364, 375-410, 412-418, 432-438, 445-451, 465-475, 
488-496, 505-515, 558-564, 571-579, 585-595, 604-613, 626-643, 652-659, 677-686, 688-696, 702-709, 731-747, 
777-795, 820-828, 836-842, 845-856, 863-868, 874-882, 900-909, 926-943, 961-976, 980-986, 992-998, 1022-1034, 
1044-1074, 1085-1096, 1101-1112, 1117-1123, 1130-1147, 1181-1187, 1204-1211, 1213-1223, 1226-1239, 1242- 
1249, 1265-1271, 1273-1293, 1300-1308, 1361-1367, 1378-1384, 1395-1406, 1420-1428, 1439-1446, 1454-1460, 
1477-1487, 1509-1520, 1526-1536, 1557-1574, 1585-1596, 1605-1617, 1621-1627, 1631-1637, 1648-1654, 1675- 
1689, 1692-1698, 1700-1706, 1712-1719, 1743-1756 and 91-263 of Seq ID No 240; 4-16, 75-90, 101-136, 138- 
144, 158-164, 171-177, 191-201, 214-222, 231-241, 284-290, 297-305, 311-321, 330-339, 352-369, 378-385, 403- 
412, 414-422, 428-435, 457-473, 503-521, 546-554, 561-568, 571-582, 589-594, 600-608, 626-635, 652-669, 687- 
702, 706-712, 718-724, 748-760, 77^-775 and 261-272 of Seq ID No 241; 4-19, 30-41, 46-57, 62-68, 75-92, 126- 
132, 149-156, 158-168, 171-184, 187-194, 210-216, 218-238, 245-253, 306-312, 323-329, 340-351, 365-373, 384- 
391, 399-405, 422-432, 454-465, 471-481, 502-519, 530-541, 550-562, 566-57^, 576-582, 593-599, 620-634, 637- 
643, 645-651, 657-664, 688-701 and 541-551 of Seq ID No 242; 6-11, 17-25, 53-58, 80-86, 91-99, 101-113, 123- 
131, 162-169, 181-188, 199-231, 245-252 and 84-254 of Seq ID No 243; 13-30, 71-120, 125-137, 139-145, 184- 
199 and 61-78 of Seq ID No 244; 9-30, 38-53, 63-70, 74-97, 103-150, 158-175, 183-217, 225-253, 260-268, 272- 
286, 290-341, 352-428, 434-450, 453-460, 469-478, 513-525, 527-534, 554-563, 586-600, 602-610, 624-640, 656- 
684, 707-729, 735-749, 757-763, 766-772, 779-788, 799-805, 807-815, 819-826, 831-855 and 568-580 of Seq ID 
No 245; 11-21, 29-38 and 5-17 of Seq ID No 246; 2-9 of Seq ID No 247; 4-10, 16-28 and 7-18, 26-34 of Seq 
ID No 248; 10-16 and 1-15 of Seq ID No 249; 4-11 of Seq ID No 250; 4-40, 42-51 and 37-53 of Seq ID No 
251; 4-21 and 22-29 of Seq ID No 252; 2-11 of Seq ID No 253; 9-17, 32-44 and 1-22 of Seq ID No 254; 19-25, 
27-32 and 15-34 of Seq ID No 255; 4-12, 15-22 and 11-33 of Seq ID No 256; 10-17, 24-30, 39-46, 51-70 and 
51-61 of Seq ID No 257; 6-19 of Seq ID No 258; 6-11, 21-27, 31-54 and 11-29 of Seq ID No 259; 4-10, 13-45 
and 11-35 of Seq ID No 260; 4-14, 23-32 and 11-35 of Seq ID No 261; 14-39, 45-51 and 15-29 of Seq ID No 
262; 4-11, 14-28 and 4-17 of Seq ID No 263; 4-16 and 2-16 of Seq ID No 264; 4-10, 12-19, 39-50 and 6-22 of 
Seq ID No 265; 2-13 of Seq ID No 266; 4-11, 22-65 and 3-19 of Seq ID No 267; 17-23, 30-35, 39-46, 57-62 
and 30-49 of Seq ID No 268; 4-19 and 14-22 of Seq ID No 269; 2-9 of Seq ID No 270; 7-18, 30-43 and 4-12 
of Seq ID No 271; 4-30, 39-47 and 5-22 of Seq ID No 272; 6-15 and 14-29 of Seq ID No 273; 4-34 and 23-35 
of Seq ID No 274; 4-36, 44-57, 65-72 and 14-27 of Seq ID No 275; 4-18 and 11-20 of Seq ID No 276; 5-19 of 
Seq ID No 277; 18-36 and 6-20 of Seq ID No 278; 4-10, 19-34, 41-84, 96-104 and 50-63 of Seq ID No 279; 4- 
9, 19-27 and 8-21 of Seq ID No 280; 4-16, 18-28 and 22-30 of Seq ID No 281; 4-15 and 21-35 of Seq ID No 
282; 4-17 and 3-13 of Seq ID No 283; 4-12 and 4-18 of Seq ID No 284; 4-24, 31-36 and 29-45 of Seq ID No 
285; 12-22, 34-49 and 21-32 of Seq ID No 286; 4-17 and 22-32 of Seq ID No 287; 4-16, 25-42 and 7-28 of Seq 
ID No 288; 4-10 and 7-20 of Seq ID No 289; 4-11, 16-36, 39-54 and 28-44 of Seq ID No 290; 5-20, 29-54 and 
14-29 of Seq ID No 291; 24-33 and 10-22 of Seq ID No 292; 10-51, 54-61 and 43-64 of Seq ID No 293; 7-13 
and 2-17 of Seq ID No 294; 11-20 and 6-20 of Seq ID No 295; 4-30, 34-41 and 19-28 of Seq ID No 296; 11- 
21 of Seq ID No 297; 4-16, 21-26 and 9-38 of Seq ID No 298; 4-12, 15-27, 30-42, 66-72 and 10-24 of Seq ID 
No 299; 8-17 and 11-20 of Seq ID No 300; and 2-19 of Seq ID No 246; 1-12 of Seq ID No 247; 21-38 of Seq 
ID No 248; 2-22 of Seq ID No 254; 15-33 of Seq ID No 255; 11-32 of Seq ID No 256; 11-28 of Seq ID No 
259; 10-27 of Seq ID No 260; 9-26 of Seq ID No 261; 4-16 of Seq ID No 263; 1-18 of Seq ID No 266; 12-29 
of Seq ID No 273; 6-23 of Seq ID No 276; 1-21 of Seq ID No 277; 47-64 of Seq ID No 279; 28-45 of Seq ID 
No 285; 18-35 of Seq ID No 287; 14-31 of Seq ID No 291; 7-24 of Seq ID No 292; 8-25 of Seq ID No 299; 1- 
20 of Seq ID No 300; 18-33 of Seq ID No 151; 62-72 of Seq ID No 151; 118-131 of Seq ID No 152; 195-220 
of Seq ID No 154; 215-240 of Seq ID No 154; 255-280 of Seq ID No 154; 72-81 of Seq ID No 155; 174-186 
of Seq ID No 156; 317-331 of Seq ID No 157; 35-59 of Seq ID No 158; 54-84 of Seq ID No 158; 79-104 of 
Seq ID No 158; 33-58 of Seq ID No 159; 81-101 of Seq ID No 159; 136-150 of Seq ID No 159; 173-186 of 
Seq ID No 159; 231-251 of Seq ID No 159; 22-48 of Seq ID No 161; 24-39 of Seq ID No 162; 475-489 of 
Seq ID No 163; 38-56 of Seq ID No 164; 583-604 of Seq ID No 164; 202-223 of Seq ID No 165; 222-247 of 
Seq ID No 165; 242-267 of Seq ID No 165; 262-287 of Seq ID No 165; 282-307 of Seq ID No 165; 302-327 
of Seq ID No 165; 25-48 of Seq ID No 166; 204-217 of Seq ID No 167; 259-276 of Seq ID No 168; 121-139 
of Seq ID No 169; 260-267 of Seq ID No 169; 215-240 of Seq ID No 169; 115-140 of Seq ID No 170; 182- 
204 of Seq ID No 172; 144-153 of Seq ID No 173; 205-219 of Seq ID No 173; 196-206 of Seq ID No 174; 
240-249 of Seq ID No 174; 272-287 of Seq ID No 174; 199-223 of Seq ID No 174; 218-237 of Seq ID No 
174; 226-249 of Seq ID No 175; 287-306 of Seq ID No 175; 430-449 of Seq ID No 176; 361-375 of Seq ID 
No 177; 241-260 of Seq ID No 178; 483-502 of Seq ID No 181; 379-396 of Seq ID No 182; 31-51 of Seq ID 
No 184; 1436-1460 of Seq ID No 186; 1455-1474 of Seq ID No 186; 1469-1487 of Seq ID No 186; 215-229 of 
Seq ID No 187; 534-561 of Seq ID No 187; 59-84 of Seq ID No 187; 79-104 of Seq ID No 187; 618-635 of 
Seq ID No 188; 191-203 of Seq ID No 189; 386-398 of Seq ID No 190; 65-83 of Seq ID No 191; 90-105 of 
Seq ID No 192; 112-136 of Seq ID No 192; 290-209 of Seq ID No 193; 33-50 of Seq ID No 194; 76-90 of 
Seq ID No 195; 70-88 of Seq ID No 196; 418-442 of Seq ID No 197; 574-585 of Seq ID No 197; 87-104 of 
Seq ID No 198; 124-148 of Seq ID No 198; 141-152 of Seq ID No 198; 241-248 of Seq ID No 199; 183-198 
of Seq ID No 200; 40-57 of Seq ID No 201; 202-217 of Seq ID No 202; 50-74 of Seq ID No 203; 69-93 of 
Seq ID No 203; 88-112 of Seq ID No 203; 107-127 of Seq ID No 203; 74-92 of Seq ID No 205; 207-232 of 
Seq ID No 206; 227-252 of Seq ID No 206; 247-272 of Seq ID No 206; 47-60 of Seq ID No 207; 297-305 of 
Seq ID No 207; 312-337 of Seq ID No 207; 667-384 of Seq ID No 208; 279-295 of Seq ID No 210; 179-198 
of Seq ID No 211; 27-51 of Seq ID No 213; 46-70 of Seq ID No 213; 65-89 of Seq ID No 213; 84-108 of Seq 
ID No 213; 112-141 of Seq ID No 213; 248-260 of Seq ID No 215; 59-78 of Seq ID No 216; 154-170 of Seq 
ID No 218; 57-73 of Seq ID No 219; 297-314 of Seq ID No 220; 142-157 of Seq ID No 221; 428-447 of Seq 
ID No 222; 573-593 of Seq ID No 222; 523-544 of Seq ID No 223; 46-70 of Seq ID No 223; 65-89 of Seq ID 
No 223; 84-108 of Seq ID No 223; 122-151 of Seq ID No 223; 123-142 of Seq ID No 224; 903-921 of Seq ID 
No 225; 119-136 of Seq ID No 226; 142-161 of Seq ID No 227; 258-277 of Seq ID No 228; 272-300 of Seq 
ID No 228; 295-322 of Seq ID No 228; 311-343 of Seq ID No 229; 278-304 of Seq ID No 229; 131-150 of 
Seq ID No 230; 195-218 of Seq ID No 230; 53-70 of Seq ID No 231; 184-208 of Seq ID No 232; 222-246 of 
Seq ID No 232; 241-265 of Seq ID No 232; 260-284 of Seq ID No 232; 279-303 of Seq ID No 232; 317-341 
of Seq ID No 232; 678-696 of Seq ID No 233; 88-114 of Seq ID No 235; 464-481 of Seq ID No 235; 153-172 
of Seq ID No 236; 137-155, 166-184 of Seq ID No 236; 215-228 of Seq ID No 236; 37-51 of Seq ID No 237;
53-75 of Seq ID No 237; 232-251 of Seq ID No 237; 318-336 of Seq ID No 237; 305-315 of Seq ID No 238; 
131-156 of Seq ID No 238; 258-275 of Seq ID No 241; 107-137 of Seq ID No 243; 138-162 of Seq ID No 
243; 157-181 of Seq ID No 243; 195-227 of Seq ID No 243; 62-78 of Seq ID No 244; 567-584 of Seq ID No
245, and fragments comprising at least 6, preferably more than 8, especially more than 10 aa of said 
sequences. All these fragments individually and each independently form a preferred selected aspect of 
the present invention. 

All linear hyperimmune serum reactive fragments of a particular antigen may be identified by analysing
the entire sequence of the protein antigen by a set of peptides overlapping by 1 amino acid with a length 
of at least 10 amino acids. Subsequently, non-linear epitopes can be identified by analysis of the protein 
antigen with hyperimmune sera using the expressed full-length protein or domain polypeptides thereof. 
Assuming that a distinct domain of a protein is sufficient to form the 3D structure independent from the 
native protein, the analysis of the respective recombinant or synthetically produced domain polypeptide
with hyperimmune serum would allow the identification of conformational epitopes within the 
individual domains of multi-domain proteins. For those antigens where a domain possesses linear as well 
as conformational epitopes, competition experiments with peptides corresponding to the linear epitopes 
may be used to confirm the presence of conformational epitopes. 
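
The peptide scan described above can be sketched as follows. This is a minimal illustration only, not the procedure actually used: the function name, the default values and the example sequence are hypothetical, and the phrase "overlapping by 1 amino acid" is treated here as a configurable overlap between consecutive peptides of fixed length.

```python
def overlapping_peptides(sequence, length=10, overlap=1):
    """Split a protein sequence into fixed-length overlapping peptides.

    Returns (start, end, peptide) tuples with 1-based inclusive positions.
    `length` and `overlap` are illustrative defaults, not prescribed values.
    """
    if length < 1 or not 0 <= overlap < length:
        raise ValueError("require length >= 1 and 0 <= overlap < length")
    step = length - overlap
    peptides = []
    start = 0
    while start + length <= len(sequence):
        peptides.append((start + 1, start + length, sequence[start:start + length]))
        start += step
    # Add one final peptide if the regular windows did not reach the C-terminus.
    if len(sequence) >= length and peptides[-1][1] < len(sequence):
        start = len(sequence) - length
        peptides.append((start + 1, len(sequence), sequence[start:]))
    return peptides


# Hypothetical 26-residue placeholder sequence (the alphabet, not a real antigen).
example = overlapping_peptides("ABCDEFGHIJKLMNOPQRSTUVWXYZ", length=10, overlap=1)
```

Setting overlap to length minus one yields a scan in which consecutive peptides are shifted by a single residue, which is the other common reading of such a peptide set.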

It will be appreciated that the invention also relates to, among others, nucleic acid molecules encoding the 
aforementioned fragments, nucleic acid molecules that hybridise to nucleic acid molecules encoding the 
fragments, particularly those that hybridise under stringent conditions, and nucleic acid molecules, such 
as PCR primers, for amplifying nucleic acid molecules that encode the fragments. In these regards,
preferred nucleic acid molecules are those that correspond to the preferred fragments, as discussed 
above. 

The present invention also relates to vectors which comprise a nucleic acid molecule or nucleic acid 
molecules of the present invention, host cells which are genetically engineered with vectors of the 
invention and the production of hyperimmune serum reactive antigens and fragments thereof by
recombinant techniques. 

A great variety of expression vectors can be used to express a hyperimmune serum reactive antigen or 
fragment thereof according to the present invention. Generally, any vector suitable to maintain, 
propagate or express nucleic acids to express a polypeptide in a host may be used for expression in this 
regard. In accordance with this aspect of the invention the vector may be, for example, a plasmid vector, 
a single- or double-stranded phage vector, or a single- or double-stranded RNA or DNA viral vector. Starting
plasmids disclosed herein are either commercially available, publicly available, or can be constructed 
from available plasmids by routine application of well-known, published procedures. Preferred among 
vectors, in certain respects, are those for expression of nucleic acid molecules and hyperimmune serum 
reactive antigens or fragments thereof of the present invention. Nucleic acid constructs in host cells can 
be used in a conventional manner to produce the gene product encoded by the recombinant sequence.
Alternatively, the hyperimmune serum reactive antigens and fragments thereof of the invention can be
synthetically produced by conventional peptide synthesizers. Mature proteins can be expressed in
mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free
translation systems can also be employed to produce such proteins using RNAs derived from the DNA 
construct of the present invention. 

Host cells can be genetically engineered to incorporate nucleic acid molecules and express nucleic acid 
molecules of the present invention. Representative examples of appropriate hosts include bacterial cells, 
such as streptococci, staphylococci, E. coli, Streptomyces and Bacillus subtilis cells; fungal cells, such as
yeast cells and Aspergillus cells; insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells
such as CHO, COS, HeLa, C127, 3T3, BHK, 293 and Bowes melanoma cells; and plant cells.

The invention also provides a process for producing a S. pyogenes hyperimmune serum reactive antigen 
and a fragment thereof comprising expressing from the host cell a hyperimmune serum reactive antigen 
or fragment thereof encoded by the nucleic acid molecules provided by the present invention. The 
invention further provides a process for producing a cell, which expresses a S. pyogenes hyperimmune
serum reactive antigen or a fragment thereof comprising transforming or transfecting a suitable host cell
with the vector according to the present invention such that the transformed or transfected cell expresses
the polypeptide encoded by the nucleic acid contained in the vector.

The polypeptide may be expressed in a modified form, such as a fusion protein, and may include not 
only secretion signals but also additional heterologous functional regions. Thus, for instance, a region of 
additional amino acids, particularly charged amino acids, may be added to the N- or C-terminus of the 
polypeptide to improve stability and persistence in the host cell, during purification or during 
subsequent handling and storage. Also, regions may be added to the polypeptide to facilitate 
purification. Such regions may be removed prior to final preparation of the polypeptide. The addition of 
peptide moieties to polypeptides to engender secretion or excretion, to improve stability or to facilitate 
purification, among others, are familiar and routine techniques in the art. A preferred fusion protein 
comprises a heterologous region from immunoglobulin that is useful to solubilize or purify polypeptides.
For example, EP-A-0 464 533 (Canadian counterpart 2045869) discloses fusion proteins comprising
various portions of the constant region of immunoglobulin molecules together with another protein or part
thereof. In drug discovery, for example, proteins have been fused with antibody Fc portions for the
purpose of high-throughput screening assays to identify antagonists. See, for example, {Bennett, D. et al.,
1995} and {Johanson, K. et al., 1995}.

The S. pyogenes hyperimmune serum reactive antigen or a fragment thereof can be recovered and purified 
from recombinant cell cultures by well-known methods including ammonium sulfate or ethanol 
precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose 
chromatography, hydrophobic interaction chromatography, hydroxylapatite chromatography and lectin 
chromatography. 

The hyperimmune serum reactive antigens and fragments thereof according to the present invention can 
be produced by chemical synthesis as well as by biotechnological means. The latter comprise the 
transfection or transformation of a host cell with a vector containing a nucleic acid according to the 
present invention and the cultivation of the transfected or transformed host cell under conditions which 
are known to those skilled in the art. The production method may also comprise a purification step in
order to purify or isolate the polypeptide to be manufactured. In a preferred embodiment the vector is a 
vector according to the present invention. 

The hyperimmune serum reactive antigens and fragments thereof according to the present invention may
be used for the detection of the organism or organisms in a sample containing these organisms or
polypeptides derived therefrom. Preferably such detection is for diagnosis, more preferably for the diagnosis
of a disease, most preferably for the diagnosis of a disease related or linked to the presence or abundance
of Gram-positive bacteria, especially bacteria selected from the group comprising streptococci, 
staphylococci and lactococci. More preferably, the microorganisms are selected from the group 
comprising Streptococcus agalactiae, Streptococcus pneumoniae and Streptococcus mutans, especially the 
microorganism is Streptococcus pyogenes. 

The present invention also relates to diagnostic assays such as quantitative and diagnostic assays for
detecting levels of the hyperimmune serum reactive antigens and fragments thereof of the present
invention in cells and tissues, including determination of normal and abnormal levels. Thus, for instance, 
a diagnostic assay in accordance with the invention for detecting over-expression of the polypeptide 
compared to normal control tissue samples may be used to detect the presence of an infection, for 
example, and to identify the infecting organism. Assay techniques that can be used to determine levels of 
a polypeptide in a sample derived from a host are well-known to those of skill in the art. Such assay
methods include radioimmunoassays, competitive-binding assays, Western blot analysis and ELISA
assays. Among these, ELISAs frequently are preferred. An ELISA assay initially comprises preparing an
antibody specific to the polypeptide, preferably a monoclonal antibody. In addition, a reporter antibody
generally is prepared which binds to the monoclonal antibody. The reporter antibody is attached to a
detectable reagent such as a radioactive, fluorescent or enzymatic reagent, for example horseradish
peroxidase.

The hyperimmune serum reactive antigens and fragments thereof according to the present invention may
also be used for the purpose of or in connection with an array. More particularly, at least one of the
hyperimmune serum reactive antigens and fragments thereof according to the present invention may be 
immobilized on a support. Said support typically comprises a variety of hyperimmune serum reactive 
antigens and fragments thereof whereby the variety may be created by using one or several of the 
hyperimmune serum reactive antigens and fragments thereof according to the present invention and/or 
hyperimmune serum reactive antigens and fragments thereof being different. The characterizing feature
of such array as well as of any array in general is the fact that at a distinct or predefined region or 
position on said support or a surface thereof, a distinct polypeptide is immobilized. Because of this any 
activity at a distinct position or region of an array can be correlated with a specific polypeptide. The 
number of different hyperimmune serum reactive antigens and fragments thereof immobilized on a 
support may range from as little as 10 to several 1000 different hyperimmune serum reactive antigens 
and fragments thereof. The density of hyperimmune serum reactive antigens and fragments thereof per 
cm² is in a preferred embodiment as little as 10 peptides/polypeptides per cm² to at least 400 different
peptides/polypeptides per cm² and more particularly at least 1000 different hyperimmune serum reactive
antigens and fragments thereof per cm².

The manufacture of such arrays is known to the one skilled in the art and, for example, described in US 
patent 5,744,309. The array preferably comprises a planar, porous or non-porous solid support having at 
least a first surface. The hyperimmune serum reactive antigens and fragments thereof as disclosed herein, 
are immobilized on said surface. Preferred support materials are, among others, glass or cellulose. It is 
also within the present invention that the array is used for any of the diagnostic applications described 
herein. Apart from the hyperimmune serum reactive antigens and fragments thereof according to the 
present invention also the nucleic acid molecules according to the present invention may be used for the 
generation of an array as described above. This applies as well to an array made of antibodies, preferably 
monoclonal antibodies as, among others, described herein. 

In a further aspect the present invention relates to an antibody directed to any of the hyperimmune 
serum reactive antigens and fragments thereof, derivatives or fragments thereof according to the present 
invention. The present invention includes, for example, monoclonal and polyclonal antibodies, chimeric, 
single chain, and humanized antibodies, as well as Fab fragments, or the product of a Fab expression 
library. It is within the present invention that the antibody may be chimeric, i. e. that different parts 
thereof stem from different species or at least the respective sequences are taken from different species. 

Antibodies generated against the hyperimmune serum reactive antigens and fragments thereof
corresponding to a sequence of the present invention can be obtained by direct injection of the
hyperimmune serum reactive antigens and fragments thereof into an animal or by administering the 
hyperimmune serum reactive antigens and fragments thereof to an animal, preferably a non-human. The 
antibody so obtained will then bind the hyperimmune serum reactive antigens and fragments thereof 
itself. In this manner, even a sequence encoding only a fragment of a hyperimmune serum reactive
antigen and fragments thereof can be used to generate antibodies binding the whole native hyperimmune 
serum reactive antigen and fragments thereof. Such antibodies can then be used to isolate the 
hyperimmune serum reactive antigens and fragments thereof from tissue expressing those hyperimmune 
serum reactive antigens and fragments thereof. 

For preparation of monoclonal antibodies, any technique known in the art which provides antibodies
produced by continuous cell line cultures can be used (as described originally in {Kohler, G. et al., 1975}).

Techniques described for the production of single chain antibodies (U.S. Patent No. 4,946,778) can be
adapted to produce single chain antibodies to immunogenic hyperimmune serum reactive antigens and
fragments thereof according to this invention. Also, transgenic mice, or other organisms such as other
mammals, may be used to express humanized antibodies to immunogenic hyperimmune serum reactive
antigens and fragments thereof according to this invention.

Alternatively, phage display technology or ribosomal display could be utilized to select antibody genes 
with binding activities towards the hyperimmune serum reactive antigens and fragments thereof either 
from repertoires of PCR amplified v-genes of lymphocytes from humans screened for possessing
respective target antigens or from naive libraries {McCafferty, J. et al., 1990}; {Marks, J. et al., 1992}. The
affinity of these antibodies can also be improved by chain shuffling {Clackson, T. et al., 1991}.

If two antigen binding domains are present, each domain may be directed against a different epitope — 
termed 'bispecific' antibodies. 

The above-described antibodies may be employed to isolate or to identify clones expressing the 
hyperimmune serum reactive antigens and fragments thereof or purify the hyperimmune serum reactive
antigens and fragments thereof of the present invention by attachment of the antibody to a solid support 
for isolation and/or purification by affinity chromatography. 

Thus, among others, antibodies against the hyperimmune serum reactive antigens and fragments thereof 
of the present invention may be employed to inhibit and/or treat infections, particularly bacterial 
infections and especially infections arising from S. pyogenes. 

Hyperimmune serum reactive antigens and fragments thereof include antigenically, epitopically or 
immunologically equivalent derivatives which form a particular aspect of this invention. The term 
"antigenically equivalent derivative" as used herein encompasses a hyperimmune serum reactive antigen 
and fragments thereof or its equivalent which will be specifically recognized by certain antibodies which, 
when raised to the protein or hyperimmune serum reactive antigen and fragments thereof according to
the present invention, interfere with the interaction between pathogen and mammalian host. The term
"immunologically equivalent derivative" as used herein encompasses a peptide or its equivalent which
when used in a suitable formulation to raise antibodies in a vertebrate, the antibodies act to interfere with 
the interaction between pathogen and mammalian host. 

The hyperimmune serum reactive antigens and fragments thereof, such as an antigenically or 
immunologically equivalent derivative or a fusion protein thereof can be used as an antigen to immunize 
a mouse or other animal such as a rat or chicken. The fusion protein may provide stability to the 
hyperimmune serum reactive antigens and fragments thereof. The antigen may be associated, for 
example by conjugation, with an immunogenic carrier protein, for example bovine serum albumin (BSA) 
or keyhole limpet haemocyanin (KLH). Alternatively, an antigenic peptide comprising multiple copies of 
the protein or hyperimmune serum reactive antigen and fragments thereof, or an antigenically or 
immunologically equivalent hyperimmune serum reactive antigen and fragments thereof, may be 
sufficiently antigenic to improve immunogenicity so as to obviate the use of a carrier. 

Preferably the antibody or derivative thereof is modified to make it less immunogenic in the individual. 
For example, if the individual is human the antibody may most preferably be "humanized", wherein the 
complementarity determining region(s) of the hybridoma-derived antibody has been transplanted into a
human monoclonal antibody, for example as described in {Jones, P. et al., 1986} or {Tempest, P. et al., 
1991}. 

The use of a polynucleotide of the invention in genetic immunization will preferably employ a suitable
delivery method such as direct injection of plasmid DNA into muscle, delivery of DNA complexed with
specific protein carriers, coprecipitation of DNA with calcium phosphate, encapsulation of DNA in
various forms of liposomes, particle bombardment {Tang, D. et al., 1992}, {Eisenbraun, M. et al., 1993} and
in vivo infection using cloned retroviral vectors {Seeger, C. et al., 1984}.

In a further aspect the present invention relates to a peptide binding to any of the hyperimmune serum
reactive antigens and fragments thereof according to the present invention, and a method for the
manufacture of such peptides whereby the method is characterized by the use of the hyperimmune 
serum reactive antigens and fragments thereof according to the present invention and the basic steps are 
known to the one skilled in the art. 

Such peptides may be generated by using methods according to the state of the art such as phage display 
or ribosome display. In case of phage display, basically a library of peptides is generated, in form of 
phages, and this kind of library is contacted with the target molecule, in the present case a hyperimmune 
serum reactive antigen and fragments thereof according to the present invention. Those peptides binding 
to the target molecule are subsequently removed, preferably as a complex with the target molecule, from 
the respective reaction. It is known to the one skilled in the art that the binding characteristics, at least to a 
certain extent, depend on the particularly realized experimental set-up such as the salt concentration and 
the like. After separating those peptides binding to the target molecule with a higher affinity or a bigger 
force, from the non-binding members of the library, and optionally also after removal of the target 
molecule from the complex of target molecule and peptide, the respective peptide(s) may subsequently 
be characterised. Prior to the characterisation optionally an amplification step is realized such as, e. g. by 
propagating the peptide coding phages. The characterisation preferably comprises the sequencing of the 
target binding peptides. Basically, the peptides are not limited in their length; however,
peptides having a length of about 8 to 20 amino acids are preferably obtained in the respective
methods. The size of the libraries may be about 10^ to lO^s, preferably lO^ to lO^s different peptides, 
however, is not limited thereto. 

A particular form of target binding hyperimmune serum reactive antigens and fragments thereof are the 
so-called "anticalines" which are, among others, described in German patent application DE 197 42 706. 

In a further aspect the present invention relates to functional nucleic acids interacting with any of the 
hyperimmune serum reactive antigens and fragments thereof according to the present invention, and a 
method for the manufacture of such functional nucleic acids whereby the method is characterized by the 
use of the hyperimmune serum reactive antigens and fragments thereof according to the present 
invention and the basic steps are known to the one skilled in the art. The functional nucleic acids are 
preferably aptamers and spiegelmers. 

Aptamers are D-nucleic acids which are either single stranded or double stranded and which specifically 
interact with a target molecule. The manufacture or selection of aptamers is, e. g., described in European 
patent EP 0 533 838. Basically the following steps are realized. First, a mixture of nucleic acids, i.e.
potential aptamers, is provided whereby each nucleic acid typically comprises a segment of several, 
preferably at least eight subsequent randomised nucleotides. This mixture is subsequently contacted with 
the target molecule whereby the nucleic acid(s) bind to the target molecule, such as based on an increased 
affinity towards the target or with a bigger force thereto, compared to the candidate mixture. The binding 
nucleic acid(s) are/is subsequently separated from the remainder of the mixture. Optionally, the thus 
obtained nucleic acid(s) is amplified using, e.g. polymerase chain reaction. These steps may be repeated 
several times giving at the end a mixture having an increased ratio of nucleic acids specifically binding to 
the target from which the final binding nucleic acid is then optionally selected. These specifically binding 
nucleic acid(s) are referred to as aptamers. It is obvious that at any stage of the method for the generation or
identification of the aptamers samples of the mixture of individual nucleic acids may be taken to 
determine the sequence thereof using standard techniques. It is within the present invention that the 
aptamers may be stabilized such as, e. g., by introducing defined chemical groups which are known to 
the one skilled in the art of generating aptamers. Such modification may for example reside in the 
introduction of an amino group at the 2'-position of the sugar moiety of the nucleotides. Aptamers are
currently used as therapeutic agents. However, it is also within the present invention that the thus
selected or generated aptamers may be used for target validation and/or as lead substance for the
development of medicaments, preferably of medicaments based on small molecules. This is actually done
by a competition assay whereby the specific interaction between the target molecule and the aptamer is
inhibited by a candidate drug whereby upon replacement of the aptamer from the complex of target and 
aptamer it may be assumed that the respective drug candidate allows a specific inhibition of the 
interaction between target and aptamer, and if the interaction is specific, said candidate drug will, at least 
in principle, be suitable to block the target and thus decrease its biological availability or activity in a 
respective system comprising such target. The thus obtained small molecule may then be subject to 
further derivatisation and modification to optimise its physical, chemical, biological and/or medical 
characteristics such as toxicity, specificity, biodegradability and bioavailability. 
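
The effect of repeating the contacting, separating and amplifying steps can be illustrated with a simple expected-value calculation. The following is a sketch under stated assumptions only: the retention probabilities and the starting fraction are hypothetical numbers chosen for illustration, amplification is assumed to copy binding and non-binding nucleic acids equally, and nothing here is taken from EP 0 533 838.

```python
def enrich(binder_fraction, retain_binder, retain_nonbinder, rounds):
    """Track the expected fraction of target-binding nucleic acids per round.

    binder_fraction  -- starting fraction of binders in the mixture (hypothetical)
    retain_binder    -- probability that a binder survives the separation step
    retain_nonbinder -- probability that a non-binder is carried over
    Amplification is assumed to be unbiased, so it leaves the fraction unchanged.
    """
    if not 0.0 < binder_fraction < 1.0:
        raise ValueError("binder_fraction must lie strictly between 0 and 1")
    if not (0.0 < retain_binder <= 1.0 and 0.0 < retain_nonbinder <= 1.0):
        raise ValueError("retention probabilities must lie in (0, 1]")
    history = [binder_fraction]
    fraction = binder_fraction
    for _ in range(rounds):
        kept_binders = fraction * retain_binder
        kept_others = (1.0 - fraction) * retain_nonbinder
        fraction = kept_binders / (kept_binders + kept_others)
        history.append(fraction)
    return history


# Hypothetical example: one binder per million, 100-fold selectivity per round.
history = enrich(1e-6, retain_binder=0.5, retain_nonbinder=0.005, rounds=5)
```

Under these assumed numbers the odds of drawing a binder grow by the same factor in every round, which is why a small number of repetitions can be sufficient to obtain a mixture dominated by specifically binding nucleic acids.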

Spiegelmers and their generation or manufacture is based on a similar principle. The manufacture of 
spiegelmers is described in international patent application WO 98/08856. Spiegelmers are L-nucleic 
acids, which means that they are composed of L-nucleotides rather than D-nucleotides as aptamers are.
Spiegelmers are characterized by the fact that they have a very high stability in biological systems and,
comparable to aptamers, specifically interact with the target molecule against which they are directed. In
the process of generating spiegelmers, a heterogeneous population of D-nucleic acids is created and this
population is contacted with the optical antipode of the target molecule, in the present case for example 
with the D-enantiomer of the naturally occurring L-enantiomer of the hyperimmune serum reactive 
antigens and fragments thereof according to the present invention. Subsequently, those D-nucleic acids 
are separated which do not interact with the optical antipode of the target molecule. But those D-nucleic 
acids interacting with the optical antipode of the target molecule are separated, optionally determined 
and/or sequenced and subsequently the corresponding L-nucleic acids are synthesized based on the 
nucleic acid sequence information obtained from the D-nucleic acids. These L-nucleic acids which are 
identical in terms of sequence with the aforementioned D-nucleic acids interacting with the optical 
antipode of the target molecule, will specifically interact with the naturally occurring target molecule 
rather than with the optical antipode thereof. Similar to the method for the generation of aptamers it is 
also possible to repeat the various steps several times and thus to enrich those nucleic acids specifically
interacting with the optical antipode of the target molecule. 

In a further aspect the present invention relates to functional nucleic acids interacting with any of the 
nucleic acid molecules according to the present invention, and a method for the manufacture of such 
functional nucleic acids whereby the method is characterized by the use of the nucleic acid molecules and 
their respective sequences according to the present invention and the basic steps are known to the one 
skilled in the art. The functional nucleic acids are preferably ribozymes, antisense oligonucleotides and 
siRNA. 

Ribozymes are catalytically active nucleic acids which preferably consist of RNA which basically 
comprises two moieties. The first moiety shows a catalytic activity whereas the second moiety is 
responsible for the specific interaction with the target nucleic acid, in the present case the nucleic acid 
coding for the hyperimmune serum reactive antigens and fragments thereof according to the present 
invention. Upon interaction between the target nucleic acid and the second moiety of the ribozyme, 
typically by hybridisation and Watson-Crick base pairing of essentially complementary stretches of bases
on the two hybridising strands, the catalytically active moiety may become active which means that it 
catalyses, either intramolecularly or intermolecularly, the target nucleic acid in case the catalytic activity 
of the ribozyme is a phosphodiesterase activity. Subsequently, there may be a further degradation of the 
target nucleic acid which in the end results in the degradation of the target nucleic acid as well as the 
protein derived from the said target nucleic acid. Ribozymes, their use and design principles are known 
to the one skilled in the art, and, for example described in {Doherty, E. et al., 2001} and {Lewin, A. et al., 
2001}. 

The activity and design of antisense oligonucleotides for the manufacture of a medicament and as a
diagnostic agent, respectively, is based on a similar mode of action. Basically, antisense oligonucleotides
hybridise, based on base complementarity, with a target RNA, preferably with an mRNA, and thereby activate
RNase H. RNase H is activated by both phosphodiester and phosphorothioate-coupled DNA.
Phosphodiester-coupled DNA, however, is rapidly degraded by cellular nucleases, whereas
phosphorothioate-coupled DNA is not. These resistant, non-naturally occurring DNA derivatives do not inhibit
RNase H upon hybridisation with RNA. In other words, antisense polynucleotides are only effective as
DNA-RNA hybrid complexes. Examples of this kind of antisense oligonucleotide are described,
among others, in US patents 5,849,902 and 5,989,912. In other words, based on the nucleic acid
sequence of the target molecule which in the present case are the nucleic acid molecules for the 
hyperimmune serum reactive antigens and fragments thereof according to the present invention, either 
from the target protein from which a respective nucleic acid sequence may in principle be deduced, or by 
knowing the nucleic acid sequence as such, particularly the mRNA, suitable antisense oligonucleotides 
may be designed base on the principle of base complementarity. 
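
The principle of base complementarity can be illustrated with a minimal Python sketch. The function name antisense_dna and the example mRNA stretch are hypothetical and are not taken from the sequence listing; the sketch only derives the DNA strand that would pair with a given mRNA stretch and does not address target-site selection or backbone chemistry.

```python
# Map each RNA base to the DNA base that pairs with it (Watson-Crick).
_RNA_TO_COMPLEMENTARY_DNA = str.maketrans("ACGU", "TGCA")


def antisense_dna(mrna_segment: str) -> str:
    """Return the antisense DNA oligonucleotide, written 5'->3',
    that is complementary to an mRNA segment written 5'->3'."""
    seq = mrna_segment.upper()
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"not an RNA sequence, found: {sorted(unexpected)}")
    # Complement every base, then reverse so the result also reads 5'->3'.
    return seq.translate(_RNA_TO_COMPLEMENTARY_DNA)[::-1]


# Hypothetical six-base mRNA stretch, for illustration only.
example_oligo = antisense_dna("AUGGCC")
```

For the hypothetical stretch 5'-AUGGCC-3' the sketch yields 5'-GGCCAT-3'.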

Particularly preferred are antisense-oligonucleotides which have a short stretch of phosphorothioate 
DNA (3 to 9 bases). A minimum of 3 DNA bases is required for activation of bacterial RNase H and a 
minimum of 5 bases is required for mammalian RNase H activation. In these chimeric oligonucleotides 
there is a central region that forms a substrate for RNase H that is flanked by hybridising "arms" 
comprised of modified nucleotides that do not form substrates for RNase H. The hybridising arms of the 
chimeric oligonucleotides may be modified such as by 2'-O-methyl or 2'-fluoro. Alternative approaches
used methylphosphonate or phosphoramidate linkages in said arms. Further embodiments of the
antisense oligonucleotide useful in the practice of the present invention are P-methoxyoligonucleotides,
partial P-methoxyoligodeoxyribonucleotides or P-methoxyoligonucleotides.

Of particular relevance and usefulness for the present invention are those antisense oligonucleotides as
more particularly described in the two above-mentioned US patents. These oligonucleotides contain no
naturally occurring 5'->3'-linked nucleotides. Rather the oligonucleotides have two types of nucleotides:
2'-deoxyphosphorothioate, which activate RNase H, and 2'-modified nucleotides, which do not. The
linkages between the 2'-modified nucleotides can be phosphodiesters, phosphorothioate or
P-ethoxyphosphodiester. Activation of RNase H is accomplished by a contiguous RNase H-activating
region, which contains between 3 and 5 2'-deoxyphosphorothioate nucleotides to activate bacterial RNase
H and between 5 and 10 2'-deoxyphosphorothioate nucleotides to activate eucaryotic and, particularly,
mammalian RNase H. Protection from degradation is accomplished by making the 5' and 3' terminal
bases highly nuclease resistant and, optionally, by placing a 3' terminal blocking group.
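
The length ranges given for the contiguous RNase H-activating region can be checked with a short Python sketch. The one-letter notation is an assumption made only for this illustration ('D' for a 2'-deoxyphosphorothioate nucleotide, 'M' for a 2'-modified nucleotide), as are the function names; the sketch tests whether the longest contiguous run of 'D' positions falls within the range stated above and makes no prediction about actual enzyme activity.

```python
from itertools import groupby

# Ranges stated in the text for the contiguous RNase H-activating region.
STATED_RANGES = {"bacterial": (3, 5), "mammalian": (5, 10)}


def longest_activating_run(chemistry: str) -> int:
    """Length of the longest contiguous run of 'D' positions
    (hypothetical notation: 'D' = 2'-deoxyphosphorothioate,
    'M' = 2'-modified nucleotide)."""
    runs = (len(list(group)) for key, group in groupby(chemistry) if key == "D")
    return max(runs, default=0)


def within_stated_range(chemistry: str, target: str) -> bool:
    """True if the longest 'D' run lies inside the range stated for the
    given RNase H source ('bacterial' or 'mammalian')."""
    low, high = STATED_RANGES[target]
    return low <= longest_activating_run(chemistry) <= high


# Hypothetical chimeric layout: modified arms flanking a 4-nucleotide core.
layout = "MMMMMDDDDMMMMM"
bacterial_ok = within_stated_range(layout, "bacterial")
mammalian_ok = within_stated_range(layout, "mammalian")
```

With the hypothetical layout shown, the four-nucleotide core lies within the range stated for bacterial RNase H but below the range stated for mammalian RNase H.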

More particularly, the antisense oligonucleotide comprises a 5' terminus and a 3' terminus; and from 11
to 59 5'->3'-linked nucleotides independently selected from the group consisting of 2'-modified
phosphodiester nucleotides and 2'-modified P-alkyloxyphosphotriester nucleotides; and wherein the
5'-terminal nucleoside is attached to an RNase H-activating region of between three and ten contiguous
phosphorothioate-linked deoxyribonucleotides, and wherein the 3'-terminus of said oligonucleotide is
selected from the group consisting of an inverted deoxyribonucleotide, a contiguous stretch of one to
three phosphorothioate 2'-modified ribonucleotides, a biotin group and a P-alkyloxyphosphotriester
nucleotide.

Also an antisense oligonucleotide may be used wherein not the 5' terminal nucleoside is attached to an 
RNase H-activating region but the 3' terminal nucleoside as specified above. Also, the 5' terminus is 
selected from the particular group rather than the 3' terminus of said oligonucleotide. 

The nucleic acids as well as the hyperimmune serum reactive antigens and fragments thereof according
to the present invention may be used as or for the manufacture of pharmaceutical compositions,
especially vaccines. Preferably such a pharmaceutical composition, preferably a vaccine, is for the prevention
or treatment of diseases caused by, related to or associated with S. pyogenes. Insofar, another aspect of the
invention relates to a method for inducing an immunological response in an individual, particularly a
mammal, which comprises inoculating the individual with the hyperimmune serum reactive antigens
and fragments thereof of the invention, or a fragment or variant thereof, adequate to produce antibodies
to protect said individual from infection, particularly Streptococcus infection and most particularly S.
pyogenes infections.

Yet another aspect of the invention relates to a method of inducing an immunological response in an 
individual which comprises, through gene therapy or otherwise, delivering a nucleic acid functionally 
encoding hyperimmune serum reactive antigens and fragments thereof, or a fragment or a variant
thereof, for expressing the hyperimmune serum reactive antigens and fragments thereof, or a fragment or
a variant thereof in vivo in order to induce an immunological response to produce antibodies or a cell
mediated T cell response, either cytokine-producing T cells or cytotoxic T cells, to protect said individual 
from disease, whether that disease is already established within the individual or not. One way of 
administering the gene is by accelerating it into the desired cells as a coating on particles or otherwise. 

A further aspect of the invention relates to an immunological composition which, when introduced into a
host capable of having induced within it an immunological response, induces an immunological response 
in such host, wherein the composition comprises recombinant DNA which codes for and expresses an 
antigen of the hyperimmune serum reactive antigens and fragments thereof of the present invention. The 
immunological response may be used therapeutically or prophylactically and may take the form of 
antibody immunity or cellular immunity such as that arising from CTL or CD4+ T cells. 

The hyperimmune serum reactive antigens and fragments thereof of the invention or a fragment thereof 
may be fused with a co-protein which may not by itself produce antibodies, but is capable of stabilizing 
the first protein and producing a fused protein which will have immunogenic and protective properties. 
This fused recombinant protein preferably further comprises an antigenic co-protein, such as 
Glutathione-S-transferase (GST) or beta-galactosidase, relatively large co-proteins which solubilise the 
protein and facilitate production and purification thereof. Moreover, the co-protein may act as an 
adjuvant in the sense of providing a generalized stimulation of the immune system. The co-protein may 
be attached to either the amino or carboxy terminus of the first protein. 

Also, provided by this invention are methods using the described nucleic acid molecule or particular 
fragments thereof in such genetic immunization experiments in animal models of infection with S.
pyogenes. Such fragments will be particularly useful for identifying protein epitopes able to provoke a 
prophylactic or therapeutic immune response. This approach can allow for the subsequent preparation of 
monoclonal antibodies of particular value from the requisite organ of the animal successfully resisting or 
clearing infection for the development of prophylactic agents or therapeutic treatments of S. pyogenes 
infection in mammals, particularly humans. 

The hyperimmune serum reactive antigens and fragments thereof may be used as an antigen for 
vaccination of a host to produce specific antibodies which protect against invasion of bacteria, for 
example by blocking adherence of bacteria to damaged tissue. Examples of tissue damage include 
wounds in skin or connective tissue caused e.g. by mechanical, chemical or thermal damage or by
implantation of indwelling devices, or wounds in the mucous membranes, such as the mouth, mammary 
glands, urethra or vagina. 

The present invention also includes a vaccine formulation which comprises the immunogenic 
recombinant protein together with a suitable carrier. Since the protein may be broken down in the 
stomach, it is preferably administered parenterally, including, for example, administration that is
subcutaneous, intramuscular, intravenous, or intradermal. Formulations suitable for parenteral
administration include aqueous and non-aqueous sterile injection solutions which may contain
anti-oxidants, buffers, bacteriostats and solutes which render the formulation isotonic with the bodily fluid,
preferably the blood, of the individual; and aqueous and non-aqueous sterile suspensions which may
include suspending agents or thickening agents. The formulations may be presented in unit-dose or
multi-dose containers, for example, sealed ampoules and vials, and may be stored in a freeze-dried
condition requiring only the addition of the sterile liquid carrier immediately prior to use. The vaccine 
formulation may also include adjuvant systems for enhancing the immunogenicity of the formulation, 
such as oil-in-water systems and other systems known in the art. The dosage will depend on the specific 
activity of the vaccine and can be readily determined by routine experimentation. 

According to another aspect, the present invention relates to a pharmaceutical composition comprising
such a hyperimmune serum-reactive antigen or a fragment thereof as provided in the present invention
for S. pyogenes. Such a pharmaceutical composition may comprise one or more hyperimmune serum
reactive antigens or fragments thereof against S. pyogenes. Optionally, such S. pyogenes hyperimmune
serum reactive antigens or fragments thereof may also be combined with antigens against other 
pathogens in a combination pharmaceutical composition. Preferably, said pharmaceutical composition 
is a vaccine for preventing or treating an infection caused by S. pyogenes and/or other pathogens against 
which the antigens have been included in the vaccine. 

According to a further aspect, the present invention relates to a pharmaceutical composition comprising a 
nucleic acid molecule encoding a hyperimmune serum-reactive antigen or a fragment thereof as
identified above for S. pyogenes. Such a pharmaceutical composition may comprise one or more nucleic 
acid molecules encoding hyperimmune serum reactive antigens or fragments thereof against S. pyogenes. 
Optionally, such S. pyogenes nucleic acid molecules encoding hyperimmune serum reactive antigens or 
fragments thereof may also be combined with nucleic acid molecules encoding antigens against other 
pathogens in a combination pharmaceutical composition. Preferably, said pharmaceutical composition is 
a vaccine for preventing or treating an infection caused by S. pyogenes and/or other pathogens against 
which the antigens have been included in the vaccine. 

The pharmaceutical composition may contain any suitable auxiliary substances, such as buffer 
substances, stabilisers or further active ingredients, especially ingredients known in connection with
pharmaceutical composition and/or vaccine production. 

A preferable carrier or excipient for the hyperimmune serum-reactive antigens, fragments thereof or a
coding nucleic acid molecule thereof according to the present invention is an immunostimulatory
compound for further stimulating the immune response to the given hyperimmune serum-reactive
antigen, fragment thereof or a coding nucleic acid molecule thereof. Preferably the immunostimulatory
compound in the pharmaceutical preparation according to the present invention is selected from the
group of polycationic substances, especially polycationic peptides, immunostimulatory nucleic acid
molecules, preferably immunostimulatory deoxynucleotides, alum, Freund's complete adjuvants,
Freund's incomplete adjuvants, neuroactive compounds, especially human growth hormone, or
combinations thereof.

It is also within the scope of the present invention that the pharmaceutical composition, especially 
vaccine, comprises apart from the hyperimmune serum reactive antigens, fragments thereof and/or 
coding nucleic acid molecules thereof according to the present invention other compounds which are 
biologically or pharmaceutically active. Preferably, the vaccine composition comprises at least one 
polycationic peptide. The polycationic compound(s) to be used according to the present invention may be
any polycationic compound which shows the characteristic effects according to WO 97/30721.
Preferred polycationic compounds are selected from basic polypeptides, organic polycations, basic
polyamino acids or mixtures thereof. These polyamino acids should have a chain length of at least 4
amino acid residues (WO 97/30721). Especially preferred are substances like polylysine, polyarginine and
polypeptides containing more than 20 %, especially more than 50 % of basic amino acids in a range of
more than 8, especially more than 20, amino acid residues or mixtures thereof. Other preferred 
polycations and their pharmaceutical compositions are described in WO 97/30721 (e.g. 
polyethyleneimine) and WO 99/38528. Preferably these polypeptides contain between 20 and 500 amino 
acid residues, especially between 30 and 200 residues. 
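
The composition thresholds stated above (more than 20 % basic amino acids in more than 8 residues, especially more than 50 % in more than 20 residues) can be expressed as a small Python check. Which residues count as basic is an assumption of this sketch (lysine, arginine and histidine in one-letter code), and the function names and test sequences are hypothetical.

```python
# Assumption: K (lysine), R (arginine) and H (histidine) count as basic.
BASIC_RESIDUES = frozenset("KRH")


def basic_fraction(peptide: str) -> float:
    """Fraction of residues in a one-letter-code peptide that are basic."""
    seq = peptide.upper()
    if not seq:
        return 0.0
    return sum(residue in BASIC_RESIDUES for residue in seq) / len(seq)


def meets_polycation_criteria(
    peptide: str, min_fraction: float = 0.20, min_length: int = 8
) -> bool:
    """True if the peptide has more than `min_length` residues and more
    than `min_fraction` of them are basic (both limits are exclusive,
    following the 'more than' wording of the text)."""
    return len(peptide) > min_length and basic_fraction(peptide) > min_fraction


# Hypothetical decamer of lysine, checked against both threshold pairs.
decalysine = "K" * 10
passes_general = meets_polycation_criteria(decalysine)
passes_especially = meets_polycation_criteria(
    decalysine, min_fraction=0.50, min_length=20
)
```

The hypothetical decalysine satisfies the general thresholds but not the "especially" thresholds, because it has fewer than 21 residues.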

These polycationic compounds may be produced chemically or recombinantly or may be derived from 
natural sources. 

Cationic (poly)peptides may also be anti-microbial with properties as reviewed in {Ganz, T., 1999}. These 
(poly)peptides may be of prokaryotic or animal or plant origin or may be produced chemically or
recombinantly (WO 02/13857). Peptides may also belong to the class of defensins (WO 02/13857).
Sequences of such peptides can, for example, be found in the Antimicrobial Sequences Database under
the following internet address:

http://www.bbcm.univ.trieste.it/-tossi/pag2.html 

Such host defence peptides or defensins are also a preferred form of the polycationic polymer according
to the present invention. Generally, a compound allowing as an end product activation (or
down-regulation) of the adaptive immune system, preferably mediated by APCs (including dendritic cells), is
used as polycationic polymer.

Especially preferred for use as polycationic substances in the present invention are cathelicidin derived 
antimicrobial peptides or derivatives thereof (International patent application WO 02/13857, incorporated 
herein by reference), especially antimicrobial peptides derived from mammal cathelicidin, preferably 
from human, bovine or mouse.

Polycationic compounds derived from natural sources include HIV-REV or HIV-TAT (derived cationic
peptides, antennapedia peptides, chitosan or other derivatives of chitin) or other peptides derived from 
these peptides or proteins by biochemical or recombinant production. Other preferred polycationic 
compounds are cathelin or related or derived substances from cathelin. For example, mouse cathelin is a 
peptide which has the amino acid sequence NH2-RLAGLLRKGGEKIGEKLKKIGOKIKNFFQKLVPQPE- 
COOH. Related or derived cathelin substances contain the whole or parts of the cathelin sequence with at 
least 15-20 amino acid residues. Derivations may include the substitution or modification of the natural 
amino acids by amino acids which are not among the 20 standard amino acids. Moreover, further cationic 
residues may be introduced into such cathelin molecules. These cathelin molecules are preferred to be 
combined with the antigen. These cathelin molecules surprisingly have turned out to be also effective as
an adjuvant for an antigen without the addition of further adjuvants. It is therefore possible to use such
cathelin molecules as efficient adjuvants in vaccine formulations with or without further
immunoactivating substances.

Another preferred polycationic substance to be used according to the present invention is a synthetic 
peptide containing at least 2 KLK-motifs separated by a linker of 3 to 7 hydrophobic amino acids
(International patent application WO 02/32451, incorporated herein by reference). 
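
The motif rule just described (two KLK-motifs separated by a linker of 3 to 7 hydrophobic amino acids) can be sketched as a regular expression in Python. The set of residues treated as hydrophobic is an assumption of this sketch, and the function name and the test peptides are hypothetical.

```python
import re

# Assumption: residues treated as hydrophobic for the linker (one-letter code).
HYDROPHOBIC = "AVLIMFW"

# KLK, then a linker of 3 to 7 hydrophobic residues, then KLK again.
_KLK_LINKER_PATTERN = re.compile(rf"KLK[{HYDROPHOBIC}]{{3,7}}KLK")


def has_klk_linker_motif(peptide: str) -> bool:
    """True if the peptide contains two KLK-motifs separated by a
    linker of 3 to 7 residues drawn from HYDROPHOBIC."""
    return _KLK_LINKER_PATTERN.search(peptide.upper()) is not None


# Hypothetical peptides: a five-leucine linker and a two-leucine linker.
five_linker = has_klk_linker_motif("KLKLLLLLKLK")
two_linker = has_klk_linker_motif("KLKLLKLK")
```

The first hypothetical peptide satisfies the rule; the second does not, because its linker is shorter than three residues.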

The pharmaceutical composition of the present invention may further comprise immunostimulatory
nucleic acid(s). Immunostimulatory nucleic acids are e.g. neutral or artificial CpG containing nucleic
acid, short stretches of nucleic acid derived from non-vertebrates or in form of short oligonucleotides
(ODNs) containing non-methylated cytosine-guanine di-nucleotides (CpG) in a certain base context (e.g.
described in WO 96/02555). Alternatively, also nucleic acids based on inosine and cytidine as e.g.
described in the WO 01/93903, or deoxynucleic acids containing deoxy-inosine and/or deoxyuridine
residues (described in WO 01/93905 and PCT/EP 02/05448, incorporated herein by reference) may
preferably be used as immunostimulatory nucleic acids for the present invention. Preferably,
mixtures of different immunostimulatory nucleic acids may be used according to the present invention.

It is also within the present invention that any of the aforementioned polycationic compounds is
combined with any of the immunostimulatory nucleic acids as aforementioned. Preferably, such
combinations are according to the ones as described in WO 01/93905, WO 02/32451, WO 01/54720, WO 
01/93903, WO 02/13857 and PCT/EP 02/05448 and the Austrian patent application A 1924/2001, 
incorporated herein by reference. 

In addition or alternatively such vaccine composition may comprise apart from the hyperimmune serum 
reactive antigens and fragments thereof, and the coding nucleic acid molecules thereof according to the 
present invention a neuroactive compound. Preferably, the neuroactive compound is human growth 
factor as, e.g. described in WO 01/24822. Also preferably, the neuroactive compound is combined with
any of the polycationic compounds and/or immunostimulatory nucleic acids as afore-mentioned. 

In a further aspect the present invention is related to a pharmaceutical composition. Such pharmaceutical 
composition is, for example, the vaccine described herein. Also a pharmaceutical composition is a 
pharmaceutical composition which comprises any of the following compounds or combinations thereof:
the nucleic acid molecules according to the present invention, the hyperimmune serum reactive antigens 
and fragments thereof according to the present invention, the vector according to the present invention, 
the cells according to the present invention, the antibody according to the present invention, the 
functional nucleic acids according to the present invention and the binding peptides such as the 
anticalines according to the present invention, any agonists and antagonists screened as described herein. 
In connection therewith any of these compounds may be employed in combination with a non-sterile or 
sterile carrier or carriers for use with cells, tissues or organisms, such as a pharmaceutical carrier suitable 
for administration to a subject. Such compositions comprise, for instance, a media additive or a 
therapeutically effective amount of a hyperimmune serum reactive antigen and fragments thereof of the 
invention and a pharmaceutically acceptable carrier or excipient. Such carriers may include, but are not 
limited to, saline, buffered saline, dextrose, water, glycerol, ethanol and combinations thereof. The 
formulation should suit the mode of administration. 

The pharmaceutical compositions may be administered in any effective, convenient manner including, 
for instance, administration by topical, oral, anal, vaginal, intravenous, intraperitoneal, intramuscular, 
subcutaneous, intranasal or intradermal routes among others. 

In therapy or as a prophylactic, the active agent may be administered to an individual as an injectable 
composition, for example as a sterile aqueous dispersion, preferably isotonic. 

Alternatively the composition may be formulated for topical application, for example in the form of 
ointments, creams, lotions, eye ointments, eye drops, ear drops, mouthwash, impregnated dressings and 
sutures and aerosols, and may contain appropriate conventional additives, including, for example, 
preservatives, solvents to assist drug penetration, and emollients in ointments and creams. Such topical 
formulations may also contain compatible conventional carriers, for example cream or ointment bases, 
and ethanol or oleyl alcohol for lotions. Such carriers may constitute from about 1 % to about 98 % by
weight of the formulation; more usually they will constitute up to about 80 % by weight of the 
formulation. 

In addition to the therapy described above, the compositions of this invention may be used generally as a 
wound treatment agent to prevent adhesion of bacteria to matrix proteins exposed in wound tissue and 
for prophylactic use in dental treatment as an alternative to, or in conjunction with, antibiotic 
prophylaxis. 

A vaccine composition is conveniently in injectable form. Conventional adjuvants may be employed to
enhance the immune response. A suitable unit dose for vaccination is 0.05-5 µg/kg of antigen, and such
dose is preferably administered 1-3 times and with an interval of 1-3 weeks.
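
The per-kilogram unit dose range stated above translates into an absolute antigen amount by simple multiplication, as the following Python sketch shows. The function name and the example body weight are hypothetical; the sketch is plain arithmetic on the stated range and is not a dosing recommendation.

```python
# Unit dose range stated in the text, in micrograms of antigen per kg.
UNIT_DOSE_RANGE_UG_PER_KG = (0.05, 5.0)


def unit_dose_range_ug(body_weight_kg: float) -> tuple[float, float]:
    """Return the (lower, upper) antigen amount in micrograms that
    corresponds to the stated per-kilogram range for one unit dose."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = UNIT_DOSE_RANGE_UG_PER_KG
    return (low * body_weight_kg, high * body_weight_kg)


# Hypothetical 20 kg individual, for illustration only.
lower_ug, upper_ug = unit_dose_range_ug(20.0)
```

For the hypothetical 20 kg individual this corresponds to roughly 1 µg to 100 µg of antigen per unit dose.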

With the indicated dose range, no adverse toxicological effects should be observed with the compounds 
of the invention which would preclude their administration to suitable individuals.

In a further embodiment the present invention relates to diagnostic and pharmaceutical packs and kits 
comprising one or more containers filled with one or more of the ingredients of the aforementioned 
compositions of the invention. The ingredient(s) can be present in a useful amount, dosage, formulation 
or combination. Associated with such container(s) can be a notice in the form prescribed by a 
governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, 
reflecting approval by the agency of the manufacture, use or sale of the product for human 
administration. 

In connection with the present invention any disease related use as disclosed herein such as, e.g., use of
the pharmaceutical composition or vaccine, is particularly a disease or diseased condition which is
caused by, linked or associated with Streptococci, more preferably, S. pyogenes. In connection therewith it
is to be noted that S. pyogenes comprises several strains including those disclosed herein. A disease
related, caused or associated with the bacterial infection to be prevented and/or treated according to the 
present invention includes besides others bacterial pharyngitis, scarlet fever, impetigo, rheumatic fever, 
necrotizing fasciitis and sepsis in humans. 

In a still further embodiment the present invention is related to a screening method using any of the 
hyperimmune serum reactive antigens or nucleic acids according to the present invention. Screening 
methods as such are known to the one skilled in the art and can be designed such that an agonist or an 
antagonist is screened. Preferably an antagonist is screened which in the present case inhibits or prevents 
the binding of any hyperimmune serum reactive antigen and fragment thereof according to the present 
invention to an interaction partner. Such interaction partner can be a naturally occurring interaction 
partner or a non-naturally occurring interaction partner. 

The invention also provides a method of screening compounds to identify those which enhance (agonist)
or block (antagonist) the function of hyperimmune serum reactive antigens and fragments thereof or 
nucleic acid molecules of the present invention, such as its interaction with a binding molecule. The 
method of screening may involve high-throughput. 

For example, to screen for agonists or antagonists, the interaction partner of the nucleic acid molecule and
nucleic acid, respectively, according to the present invention may be a synthetic reaction mix, a cellular
compartment, such as a membrane, cell envelope or cell wall, or a preparation of any thereof, and may be
prepared from a cell that expresses a molecule that binds to the hyperimmune serum reactive antigens
and fragments thereof of the present invention. The preparation is incubated with labelled hyperimmune 
serum reactive antigens and fragments thereof in the absence or the presence of a candidate molecule 
which may be an agonist or antagonist. The ability of the candidate molecule to bind the binding 
molecule is reflected in decreased binding of the labelled ligand. Molecules which bind gratuitously, i. e., 
without inducing the functional effects of the hyperimmune serum reactive antigens and fragments
thereof, are most likely to be good antagonists. Molecules that bind well and elicit functional effects that
are the same as or closely related to the hyperimmune serum reactive antigens and fragments thereof are
good agonists. 

The functional effects of potential agonists and antagonists may be measured, for instance, by 
determining the activity of a reporter system following interaction of the candidate molecule with a cell 
or appropriate cell preparation, and comparing the effect with that of the hyperimmune serum reactive
antigens and fragments thereof of the present invention or molecules that elicit the same effects as the
hyperimmune serum reactive antigens and fragments thereof. Reporter systems that may be useful in this
regard include but are not limited to colorimetric labelled substrate converted into product, a reporter
gene that is responsive to changes in the functional activity of the hyperimmune serum reactive antigens
and fragments thereof, and binding assays known in the art. 

Another example of an assay for antagonists is a competitive assay that combines the hyperimmune 
serum reactive antigens and fragments thereof of the present invention and a potential antagonist with 
membrane-bound binding molecules, recombinant binding molecules, natural substrates or ligands, or
substrate or ligand mimetics, under appropriate conditions for a competitive inhibition assay. The
hyperimmune serum reactive antigens and fragments thereof can be labelled such as by radioactivity or a
colorimetric compound, such that the number of molecules of hyperimmune serum reactive antigens and
fragments thereof bound to a binding molecule or converted to product can be determined accurately to
assess the effectiveness of the potential antagonist. 
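
How the effectiveness of a potential antagonist might be quantified from such a competitive assay can be sketched in Python as a percent-inhibition calculation. The function name, the optional background correction and the example signal values are assumptions made for illustration; the text itself does not prescribe a particular formula or readout.

```python
def percent_inhibition(
    bound_with_candidate: float,
    bound_without_candidate: float,
    background: float = 0.0,
) -> float:
    """Percent reduction in bound labelled antigen caused by a candidate
    antagonist, relative to a control without the candidate.

    All three arguments are signal readings in the same arbitrary units;
    `background` is the reading without any specific binding (assumed 0
    if not measured)."""
    specific_control = bound_without_candidate - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_test = bound_with_candidate - background
    return 100.0 * (1.0 - specific_test / specific_control)


# Hypothetical readings: 10000 units without and 2500 units with candidate.
inhibition = percent_inhibition(2500.0, 10000.0)
```

With the hypothetical readings shown, the candidate reduces bound labelled antigen by 75 %.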

Potential antagonists include small organic molecules, peptides, polypeptides and antibodies that bind to
a hyperimmune serum reactive antigen and fragments thereof of the invention and thereby inhibit or
extinguish its activity. Potential antagonists also may be small organic molecules, a peptide, a
polypeptide such as a closely related protein or antibody that binds to the same sites on a binding 
molecule without inducing functional activity of the hyperimmune serum reactive antigens and 
fragments thereof of the invention. 

Potential antagonists include a small molecule which binds to and occupies the binding site of the 
hyperimmune serum reactive antigens and fragments thereof thereby preventing binding to cellular 
binding molecules, such that normal biological activity is prevented. Examples of small molecules 
include but are not limited to small organic molecules, peptides or peptide-like molecules.

Other potential antagonists include antisense molecules (see {Okano, H. et al., 1991};
OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE EXPRESSION; CRC Press, Boca
Raton, FL (1988), for a description of these molecules).

Preferred potential antagonists include derivatives of the hyperimmune serum reactive antigens and 
fragments thereof of the invention. 

As used herein the activity of a hyperimmune serum reactive antigen and fragment thereof according to 
the present invention is its capability to bind to any of its interaction partners or the extent of such
capability to bind to its or any interaction partner. 

In a particular aspect, the invention provides the use of the hyperimmune serum reactive antigens and 
fragments thereof, nucleic acid molecules or inhibitors of the invention to interfere with the initial 
physical interaction between a pathogen and mammalian host responsible for sequelae of infection. In 
particular the molecules of the invention may be used: i) in the prevention of adhesion of S. pyogenes to 
mammalian extracellular matrix proteins on in-dwelling devices or to extracellular matrix proteins in 
wounds; ii) to block protein mediated mammalian cell invasion by, for example, initiating
phosphorylation of mammalian tyrosine kinases {Rosenshine, I. et al., 1992}; iii) to block bacterial adhesion
between mammalian extracellular matrix proteins and bacterial proteins which mediate tissue damage;
iv) to block the normal progression of pathogenesis in infections initiated other than by the implantation 
of in-dwelling devices or by other surgical techniques. 

Each of the DNA coding sequences provided herein may be used in the discovery and development of 
antibacterial compounds. The encoded protein upon expression can be used as a target for the screening
of antibacterial drugs. Additionally, the DNA sequences encoding the amino terminal regions of the
encoded protein or Shine-Dalgarno or other translation facilitating sequences of the respective mRNA can
be used to construct antisense sequences to control the expression of the coding sequence of interest. 

The antagonists and agonists may be employed, for instance, to inhibit diseases arising from infection
with Streptococcus, especially S. pyogenes, such as sepsis.

In a still further aspect the present invention is related to an affinity device. Such affinity device comprises
at least a support material and any of the hyperimmune serum reactive antigens and fragments thereof
according to the present invention which is attached to the support material. Because of the specificity of
the hyperimmune serum reactive antigens and fragments thereof according to the present invention for
their target cells or target molecules or their interaction partners, the hyperimmune serum reactive 
antigens and fragments thereof allow a selective removal of their interaction partner(s) from any kind of 
sample applied to the support material provided that the conditions for binding are met. The sample may 
be a biological or medical sample, including but not limited to, fermentation broth, cell debris, cell 
preparation, tissue preparation, organ preparation, blood, urine, lymph liquid, liquor and the like. 

The hyperimmune serum reactive antigens and fragments thereof may be attached to the matrix in a
covalent or non-covalent manner. Suitable support material is known to the one skilled in the art and can
be selected from the group comprising cellulose, silicon, glass, aluminium, paramagnetic beads, starch
and dextran.

The present invention is further illustrated by the following figures, examples and the sequence listing 
from which further features, embodiments and advantages may be taken. It is to be understood that the
present examples are given by way of illustration only and not by way of limitation of the disclosure. 

In connection with the present invention 

Figure 1 shows the characterization of S. pyogenes specific human sera. 

Figure 2 shows the characterization of the small fragment genomic library, LSPy-70, from Streptococcus 
pyogenes SF370/M1. 

Figure 3 shows the selection of bacterial cells by MACS using biotinylated human IgGs. 
Figure 4 shows an example for the gene distribution study with the identified antigens. 
Figure 5 shows cell surface staining by flow cytometry. 

Figure 6 shows the protective value of identified recombinant S. pyogenes antigens. 

Table 1 shows the summary of all screens performed with genomic S. pyogenes libraries and human 
serum. 

Table 2 shows the epitope serology with human sera. 

Table 3 shows the summary of the gene distribution analysis for the identified antigens in fifty S. pyogenes 
strains. 

Table 4 summarizes the information on the antigenic proteins used for the immunization experiments.

Table 5 shows the variability of antigenic proteins in six different strains of S. pyogenes.

The figures referred to in the specification are described in more detail in the following.

Figure 1 shows the characterization of human sera for S. pyogenes as measured by ELISA. 

Figure 2 (A) shows the fragment size distribution of the Streptococcus pyogenes SF370/M1 small fragment
genomic library, LSPy-70. After sequencing 576 randomly selected clones, sequences were trimmed to
eliminate vector residues and the number of clones with various genomic fragment sizes was plotted.
(B) Graphic illustration of the distribution of the same set of randomly sequenced clones of LSPy-70 over
the S. pyogenes chromosome. Blue circles indicate matching sequences to annotated ORFs in +/+
orientation. Red rectangles represent fully matched clones to non-coding chromosomal sequences in +/+
orientation. Green diamonds position all clones with complementary or chimeric sequences. Numeric
distances in base pairs are indicated over each circular genome for orientation. Partitioning of various
clone sets within the library is given in numbers and percentage at the bottom of the figure.

Figure 3A shows the MACS selection with biotinylated human IgGs. The LSPy-70 library in pMAL9.1
was screened with 10 µg biotinylated, human serum (P4-IgG) in the first and with 1 µg in the second
selection round. As negative control, no serum was added to the library cells for screening. The number of
cells selected after the 1st and 2nd elution is shown for each selection round. Figure 3B shows the
reactivity of specific clones (1-52) isolated by bacterial surface display as analysed by Western blot
analysis with the human serum (P4-IgG) used for selection by MACS at a dilution of 1:3,000. As a loading
control the same blot was also analysed with antibodies directed against the platform protein LamB at a
dilution of 1:5,000. LB, Extract from a clone expressing LamB without foreign peptide insert.

Figure 4A shows the emm types of S. pyogenes analysed for the gene distribution study. Figure 4B shows
the PCR analysis for the gene distribution of gene Spy0269 with the respective oligonucleotides. The
predicted size of the PCR fragments is 1,000 bp. 1-50, S. pyogenes strains as listed under A; N, no genomic
DNA added; P, genomic DNA from S. pyogenes SF370, which served as template for library construction.

Figure 5 shows the detection of specific antibody binding on the cell surface of Group A Streptococcus by flow
cytometry. In Figure 5A preimmune mouse sera and polyclonal sera raised against S. pyogenes lysate were
incubated with S. pyogenes strain SF370/M1 and analysed by flow cytometry. Control represents the level
of non-specific binding of the secondary antibody to the surface of S. pyogenes cells. The histograms in
Figure 5B and 5C indicate the increased fluorescence due to specific binding of anti-Spy0012 (B) or
anti-Spy1315 and anti-Spy1798 (C) antibodies in comparison to the control sera against the two platform
proteins LamB and FhuA, respectively.

Figure 6 NMRI mice were immunized with 3 consecutive doses of recombinant protein (50 µg/dose) two weeks
apart on days 0, 14 and 28. As negative control, mice were immunized with PBS in the presence of adjuvant. The
M1 protein (Spy2018) served as positive control for the challenge experiment. The bacterial challenge was
performed with 5x10^ S. pyogenes AP1 cells i.v. and survival of mice was observed daily for A) 18 days, B) 21
days and C) 19 days, respectively.

Table 1: Immunogenic proteins identified by bacterial surface display. 

A, LSPy-70 library in lamB with IC3-IgG (1588); B, LSPy-70 library in lamB with IC3-IgA (1539); C, LSPy-70
library in lamB with IC6-IgG (1173); D, LSPy-70 library in lamB with P4-IgG (1138); E, LSPy-70 library
in lamB with P4-IgA (981); F, LSPy-150 library in btuB with IC3-IgG (991); G, LSPy-150 library in btuB
with IC6-IgG (1036); H, LSPy-150 library in btuB with P4-IgG (681); I, LSPy-400 library in fhuA with
IC3-IgG (559); K, LSPy-400 library in fhuA with IC6-IgG (543); L, LSPy-400 library in fhuA with P4-IgG (20);
*, prediction of antigenic sequences longer than 5 amino acids was performed with the program
ANTIGENIC {Kolaskar, A. et al., 1990}.

Table 2: Epitope serology with human sera. 

Immune reactivity of individual synthetic peptides representing selected epitopes with individual human
sera is shown. Extent of reactivity is pattern/grey coded; white: - (<50U), grey: + (50-119U), diagonal: ++
(120-199U), diagonally crossed: +++ (200-1000U) and vertically crossed: ++++ (>1000U). ELISA units (U)
are calculated from OD545nm readings and the serum dilution after correction for background. Score, sum
of all reactivities (addition of the number of all +); P1 to P10 sera are from patients with acute pharyngitis
and N1 to N10 sera are from healthy adults. P and N are used as internal controls.
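
The grading scheme of this legend can be sketched in a few lines of code. This is an illustration only, not the procedure used to generate Table 2. It assumes that ELISA units are the background-corrected OD multiplied by the serum dilution factor (the exact formula is not given in this legend), and all function names are hypothetical.

```python
def elisa_units(od, background, dilution):
    """Assumed formula: background-corrected OD times the dilution factor."""
    return max(od - background, 0.0) * dilution


def reactivity_grade(units):
    """Map ELISA units to the number of '+' signs used in the legend."""
    if units < 50:
        return 0  # -    (<50 U)
    if units < 120:
        return 1  # +    (50-119 U)
    if units < 200:
        return 2  # ++   (120-199 U)
    if units <= 1000:
        return 3  # +++  (200-1000 U)
    return 4      # ++++ (>1000 U)


def peptide_score(units_per_serum):
    """Score of one peptide: sum of all '+' over the sera tested."""
    return sum(reactivity_grade(u) for u in units_per_serum)
```

For example, a peptide tested against five sera with 10, 60, 150, 500 and 2000 U would be graded -, +, ++, +++ and ++++ and would receive a score of 10.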

Peptide names: SPO0012, annotated ORF Spy0012; SPA0450, potential novel ORF in alternative reading
frame of Spy0450; SPC0406, potential novel ORF on complement of Spy0406; SPN0001, potential novel
ORF in non-coding region.

Table 3: Gene distribution in S. pyogenes strains. 

Fifty S. pyogenes strains as shown in Figure 4A were tested by PCR with oligonucleotides specific for the
genes encoding relevant antigens. One selected PCR fragment was sequenced in
order to confirm the amplification of the correct DNA fragment. *, number of amino acid substitutions in
strain M89 as compared to S. pyogenes SF370 (M1). #, alternative strain used for sequencing, because gene
was not present in M89.
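
The kind of summary reported in such a gene distribution table can be sketched as follows. The data layout (a mapping from gene name to per-strain PCR results) and the function name are assumptions made for illustration; they are not taken from the experimental records.

```python
def gene_distribution(pcr_results):
    """Summarise PCR presence/absence per gene.

    pcr_results maps a gene name to a dict {strain name: True/False}, where
    True means a PCR product of the expected size was observed.
    Returns {gene: (positive strains, strains tested, percent positive)}.
    """
    summary = {}
    for gene, by_strain in pcr_results.items():
        positive = sum(1 for present in by_strain.values() if present)
        tested = len(by_strain)
        summary[gene] = (positive, tested, 100.0 * positive / tested)
    return summary
```

A gene detected in all tested strains would be reported with 100 percent, which corresponds to the ubiquitous distribution desired for a vaccine antigen.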

Table 4: Recombinant proteins used for immunisation experiments in NMRI mice. 

Immunization with recombinant antigens and challenge with pathogenic S. pyogenes AP1 was performed as
described under Experimental procedures. A, The amino acids of the respective antigen contained within the
recombinant protein as used for the immunization experiments in animals are given in relation to the
full-length protein. B, Percentage of survival is represented as protection; the parentheses give the percentage
of protection of the negative control (PBS immunized) followed by the percentage of protection of the positive
control (Spy2018). C, Spy0269 was selected because the mice showed better survival, although at the
end of the observation time all mice had died. This is reflected by the average survival time as measured in days:
14.6 (Spy0269), 11.6 (PBS) and 19.3 days (Spy2018).
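
The two read-outs mentioned in this legend, percentage of survival and average survival time, can be computed as in the following sketch. It assumes that each mouse is recorded by its day of death (None if alive at the end of observation) and that surviving mice are counted with the full observation time; both conventions and the function names are assumptions, not statements about how the values above were derived.

```python
def percent_survival(death_days):
    """Percentage of mice still alive at the end of the observation period.

    death_days holds one entry per mouse: the day of death, or None if the
    mouse survived the whole observation period.
    """
    survivors = sum(1 for day in death_days if day is None)
    return 100.0 * survivors / len(death_days)


def mean_survival_time(death_days, observation_days):
    """Average survival time in days; survivors count as observation_days."""
    times = [observation_days if day is None else day for day in death_days]
    return sum(times) / len(times)
```

With this read-out a group in which all mice eventually die has 0 percent survival, but can still show a longer average survival time than the negative control, which is the situation described for Spy0269.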

Table 5: Sequence variation of antigenic proteins from S. pyogenes. 

Antigenic proteins were analysed for amino acid exchanges in six different S. pyogenes strains as listed under
experimental procedures. The residue number indicates the position of the amino acid in the full-length protein.
In case of Spy1666, changes relative to a homologous gene in Streptococcus pneumoniae TIGR4 (SP0334) are listed,
because the gene is highly conserved in S. pyogenes as well as S. pneumoniae. A, amino acid residue in protein
from S. pyogenes SF370. B, amino acid residue(s), which may occur in any one of the analysed genes from the other
five S. pyogenes strains, if different from S. pyogenes SF370. C, residues of Spy0416 involved in catalytic activity.
Changes in these residues are anticipated to render the enzyme inactive; the residues are therefore exchanged
experimentally with alanine, serine, threonine or glycine to produce an enzymatically inactive recombinant
protein.
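
A listing of amino acid exchanges of this kind can be derived from aligned protein sequences as sketched below. The sketch assumes that all sequences are already aligned and of equal length (no insertions or deletions); the function name and the example sequences in the usage note are hypothetical.

```python
def amino_acid_exchanges(reference, variants):
    """List positions at which any variant differs from the reference.

    reference: protein sequence of the reference strain (one-letter code).
    variants:  aligned sequences of the same protein from other strains.
    Returns {position (1-based): (reference residue, sorted variant residues)}.
    """
    exchanges = {}
    for seq in variants:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the same length")
        for pos, (ref_aa, var_aa) in enumerate(zip(reference, seq), start=1):
            if var_aa != ref_aa:
                exchanges.setdefault(pos, set()).add(var_aa)
    return {pos: (reference[pos - 1], sorted(residues))
            for pos, residues in sorted(exchanges.items())}
```

For a reference "MKTAY" and variants "MKSAY" and "MRTAY", the function reports position 2 (K replaced by R) and position 3 (T replaced by S), in the same column layout as A and B above.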

EXAMPLES 

Example 1: Preparation of antibodies from human serum 

The antibodies produced against group A streptococci by the human immune system and present in 
human sera are indicative of the in vivo expression of the antigenic proteins and their immunogenicity. 
These molecules are essential for the identification of individual antigens in the approach as described in 
the present invention, which is based on the interaction of the specific anti-streptococcal antibodies and
the corresponding S. pyogenes peptides or proteins. To gain access to relevant antibody repertoires,
human sera were collected from 

I. patients with acute S. pyogenes infections, such as pharyngitis, wound infection and
bacteraemia (S. pyogenes was shown to be the causative agent by medical microbiological tests),

II. uninfected healthy adults, since group A streptococcal infections are common, and antibodies
are present as a consequence of natural immunization from previous encounters with streptococci.

The sera were characterized for anti-S. pyogenes antibodies by a series of ELISA and immunoblotting 
assays. Several streptococcal antigens have been used to show that the titers measured were not a result 
of the sum of cross-reactive antibodies. For that purpose two different antigen preparations were used:
whole cell extract or culture supernatant proteins prepared from S. pyogenes SF370/M1 cultured overnight 
(stationary phase) in THB (Todd-Hewitt Broth) growth medium. Both IgG and IgA antibody levels were 
determined. Sera were selected for further analysis by immunoblotting based on total antibody titers 
against the two antigen preparations. 

The titers were compared at given dilutions where the response was linear (Figure 1). Sera were 
ranked based on the reactivity against multiple streptococcal components, and the highest ones were 
selected for further analysis by immunoblotting. This extensive antibody characterization approach has 
led to the unambiguous identification of anti-streptococcal hyperimmune sera.

Recently it was reported that not only IgG, but also IgA serum antibodies can be recognized by the FcRIII
receptors of PMNs and promote opsonization {Phillips-Quagliata, J. et al., 2000; Shibuya, A. et al., 2000}.
The primary role of IgA antibodies is neutralization, mainly at the mucosal surface. The level of serum
IgA reflects the quality, quantity and specificity of the dimeric secretory IgA. For that reason the serum
collection was not only analyzed for anti-streptococcal IgG, but also for IgA levels. In the ELISA assays
highly specific secondary reagents were used to detect antibodies of the high affinity types, such as
IgG and IgA, while avoiding IgM. Production of IgM antibodies occurs during the primary adaptive
humoral response, and results in low affinity antibodies, while IgG and IgA antibodies have already
undergone affinity maturation, and are more valuable in fighting or preventing disease.

Experimental procedures 

Peptide synthesis 

Peptides were synthesized in small scale (4 mg resin; up to 288 in parallel) using standard Fmoc
chemistry on a Rink amide resin (PepChem, Tübingen, Germany) using a SyroII synthesizer
(MultiSynTech, Witten, Germany). After the sequence was assembled, peptides were elongated with
Fmoc-epsilon-aminohexanoic acid (as a linker) and biotin (Sigma, St. Louis, MO; activated like a normal
amino acid). Peptides were cleaved off the resin with 93% TFA, 5% triethylsilane, and 2% water for one
hour. Peptides were dried under vacuum and freeze dried three times from acetonitrile/water (1:1). The
presence of the correct mass was verified by mass spectrometry on a Reflex III MALDI-TOF (Bruker,
Bremen, Germany). The peptides were used without further purification.

Enzyme linked immune assay (ELISA). 

For serum characterization: ELISA plates (Maxisorb, Millipore) were coated with 5-10 µg/ml total protein
diluted in coating buffer (0.1 M sodium carbonate pH 9.2). Three dilutions of sera (2,000X, 10,000X,
50,000X) were made in PBS-BSA.

For peptide serology: Biotin-labeled peptides were coated on Streptavidin ELISA plates (EXICON) at 10
µg/ml concentration according to the manufacturer's instructions. Sera were tested at two dilutions, 200X
and 1,000X.

Highly specific Horse Radish Peroxidase (HRP)-conjugated anti-human IgG or anti-human IgA
secondary antibodies (Southern Biotech) were used according to the manufacturers' recommendations
(dilution: 1,000x). Antigen-antibody complexes were quantified by measuring the conversion of the
substrate (ABTS) to colored product based on OD405nm readings in an automated ELISA reader (TECAN
SUNRISE). Following manual coating, peptide plates were processed and analyzed by the Gemini 160
ELISA robot (TECAN) with a built-in reader (GENIOS, TECAN).

Immunoblotting

Total bacterial lysate and culture supernatant samples were prepared from in vitro grown S. pyogenes
SF370/M1. 10 to 25 µg total protein/lane was separated by SDS-PAGE using the BioRad Mini-Protean 3
Cell electrophoresis system and proteins transferred to nitrocellulose membrane (ECL, Amersham 
Pharmacia). After overnight blocking in 5% milk, antisera at 2,000x dilution were added, and HRPO 
labeled anti-mouse IgG was used for detection. 

Preparation of bacterial antigen extracts 

Total bacterial lysate: Bacteria were lysed by repeated freeze-thaw cycles: incubation on dry
ice/ethanol-mixture until frozen (1 min), then thawed at 37°C (5 min): repeated 3 times. This was followed by
sonication and collection of supernatant by centrifugation (3,500 rpm, 15 min, 4°C).

Culture supernatant: After removal of bacteria, the supernatant of overnight grown bacterial cultures was 
precipitated with ice-cold ethanol (100%): 1 part supernatant/3 parts ethanol incubated o/n at -20°C. 
Precipitates were collected by centrifugation (2,600 g, for 15 min) and dried. Dry pellets were dissolved 
either in PBS for ELISA, or in urea and SDS-sample buffer for SDS-PAGE and immunoblotting. The 
protein concentration of samples was determined by Bradford assay. 

Purification of antibodies for genomic screening. Five sera from both the patient and the non-infected group
were selected based on the overall anti-streptococcal titers for a serum pool used in the screening
procedure. Antibodies against E. coli proteins were removed by incubating the heat-inactivated sera with
whole E. coli cells (DH5alpha, transformed with pHIE11, grown under the same condition as used for
bacterial surface display). Highly enriched preparations of IgGs from the pooled, depleted sera were 
generated by protein G affinity chromatography, according to the manufacturer's instructions (UltraLink 
Immobilized Protein G, Pierce). IgA antibodies were purified also by affinity chromatography using 
biotin-labeled anti-human IgA (Southern Biotech) immobilized on Streptavidin-agarose (GIBCO BRL). 
The efficiency of depletion and purification was checked by SDS-PAGE, Western blotting, ELISA and 
protein concentration measurements. 

Example 2: Generation of highly random, frame-selected, small-fragment, genomic DNA libraries of 
Streptococcus pyogenes 

Experimental procedures 

Preparation of streptococcal genomic DNA. 50 ml Todd-Hewitt Broth medium was inoculated with S.
pyogenes SF370/M1 bacteria from a frozen stab and grown with aeration and shaking for 18 h at 37°C. The
culture was then harvested, centrifuged at 1,600x g for 15 min and the supernatant was removed.
Bacterial pellets were washed 3 x with PBS and carefully re-suspended in 0.5 ml of Lysozyme solution
(100 mg/ml). 0.1 ml of 10 mg/ml heat treated RNase A and 20 U of RNase T1 were added, mixed carefully
and the solution was incubated for 1 h at 37°C. Following the addition of 0.2 ml of 20 % SDS solution and
0.1 ml of Proteinase K (10 mg/ml) the tube was incubated overnight at 55 °C. 1/3 volume of saturated
NaCl was then added and the solution was incubated for 20 min at 4°C. The extract was pelleted in a
microfuge (13,000 rpm) and the supernatant transferred into a new tube. The solution was extracted with
PhOH/CHCl3/IAA (25:24:1) and with CHCl3/IAA (24:1). DNA was precipitated at room temperature by
adding 0.6x volume of Isopropanol, spooled from the solution with a sterile Pasteur pipette and
transferred into tubes containing 80% ice-cold ethanol. DNA was recovered by centrifuging the
precipitates at 10-12,000x g, then dried on air and dissolved in ddH2O.

Preparation of small genomic DNA fragments. Genomic DNA fragments were mechanically sheared into
fragments ranging in size between 150 and 300 bp using a cup-horn sonicator (Bandelin Sonoplus UV
2200 sonicator equipped with a BBS cup horn, 10 sec. pulses at 100 % power output) or into fragments of
size between 50 and 70 bp by mild DNase I treatment (Novagen). It was observed that sonication yielded
a much tighter fragment size distribution when breaking the DNA into fragments of the 150-300 bp size
range. However, despite extensive exposure of the DNA to ultrasonic wave-induced hydromechanical
shearing force, subsequent decrease in fragment size could not be efficiently and reproducibly achieved.
Therefore, fragments of 50 to 70 bp in size were obtained by mild DNase I treatment using Novagen's
shotgun cleavage kit. A 1:20 dilution of DNase I provided with the kit was prepared and the digestion
was performed in the presence of MnCl2 in a 60 µl volume at 20°C for 5 min to ensure double-stranded
cleavage by the enzyme. Reactions were stopped with 2 µl of 0.5 M EDTA and the fragmentation
efficiency was evaluated on a 2% TAE-agarose gel. This treatment resulted in total fragmentation of
genomic DNA into near 50-70 bp fragments. Fragments were then blunt-ended twice using T4 DNA
Polymerase in the presence of 100 µM each of dNTPs to ensure efficient flushing of the ends. Fragments
were used immediately in ligation reactions or frozen at -20°C for subsequent use.

Description of the vectors. The vector pMAL4.31 was constructed on a pASK-IBA backbone {Skerra, A.,
1994} with the beta-lactamase (bla) gene exchanged with the Kanamycin resistance gene. In addition, the bla
gene was cloned into the multiple cloning site. The sequence encoding mature beta-lactamase is preceded
by the leader peptide sequence of ompA to allow efficient secretion across the cytoplasmic membrane.
Furthermore a sequence encoding the first 12 amino acids (spacer sequence) of mature beta-lactamase
follows the ompA leader peptide sequence to avoid fusion of sequences immediately after the leader
peptidase cleavage site, since e.g. clusters of positive charged amino acids in this region would decrease
or abolish translocation across the cytoplasmic membrane {Kajava, A. et al., 2000}. A SmaI restriction site
serves for library insertion. An upstream FseI site and a downstream NotI site, which were used for
recovery of the selected fragment, flank the SmaI site. The three restriction sites are inserted after the
sequence encoding the 12 amino acid spacer sequence in such a way that the bla gene is transcribed in the
-1 reading frame resulting in a stop codon 15 bp after the NotI site. A +1 bp insertion restores the bla ORF
so that beta-lactamase protein is produced with a consequent gain of Ampicillin resistance.
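
The frame-selection principle described above can be expressed as a simple length rule: because the bla gene sits in the -1 reading frame, only inserts whose length has the form 3n+1 bring it back into frame. The following sketch checks only this length condition; it does not model stop codons within an insert, and the function names are hypothetical.

```python
def restores_reading_frame(insert_length_bp):
    """True if an insert of this length (in bp) has the form 3n+1."""
    return insert_length_bp > 0 and insert_length_bp % 3 == 1


def frame_selected(insert_lengths_bp):
    """Keep only the insert lengths expected to restore the bla ORF."""
    return [n for n in insert_lengths_bp if restores_reading_frame(n)]
```

For example, inserts of 34 or 37 bp satisfy the rule, whereas inserts of 33 or 35 bp leave bla out of frame, so the corresponding clones would not be expected to grow on Ampicillin.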

The vector pMAL9.1 was constructed by cloning the lamB gene into the multiple cloning site of pEH1
{Hashemzadeh-Bonehi, L. et al., 1998}. Subsequently, a sequence was inserted in lamB after amino acid
154, containing the restriction sites FseI, SmaI and NotI. The reading frame for this insertion was
constructed in such a way that transfer of frame-selected DNA fragments excised by digestion with FseI
and NotI from plasmid pMAL4.31 yields a continuous reading frame of lamB and the respective insert.

The vector pMAL10.1 was constructed by cloning the btuB gene into the multiple cloning site of pEH1.
Subsequently, a sequence was inserted in btuB after amino acid 236, containing the restriction sites FseI,
XbaI and NotI. The reading frame for this insertion was chosen in a way that transfer of frame-selected
DNA fragments excised by digestion with FseI and NotI from plasmid pMAL4.31 yields a continuous
reading frame of btuB and the respective insert.

The vector pHIE11 was constructed by cloning the fhuA gene into the multiple cloning site of pEH1.
Thereafter, a sequence was inserted in fhuA after amino acid 405, containing the restriction sites FseI, XbaI
and NotI. The reading frame for this insertion was chosen in a way that transfer of frame-selected DNA
fragments excised by digestion with FseI and NotI from plasmid pMAL4.31 yields a continuous reading
frame of fhuA and the respective insert.

Cloning and evaluation of the library for frame selection. Genomic S. pyogenes DNA fragments were ligated
into the SmaI site of the vector pMAL4.31. Recombinant DNA was electroporated into DH10B
electrocompetent E. coli cells (GIBCO BRL) and transformants plated on LB-agar supplemented with
Kanamycin (50 µg/ml) and Ampicillin (50 µg/ml). Plates were incubated over night at 37°C and colonies
collected for large scale DNA extraction. A representative plate was stored and saved for collecting
colonies for colony PCR analysis and large-scale sequencing. A simple colony PCR assay was used to
initially determine the rough fragment size distribution as well as insertion efficiency. From sequencing
data the precise fragment size was evaluated, as well as junction intactness at the insertion site and the frame
selection accuracy (3n+1 rule).

Cloning and evaluation of the library for bacterial surface display. Genomic DNA fragments were excised from
the pMAL4.31 vector containing the S. pyogenes library with the restriction enzymes FseI and NotI. The
entire population of fragments was then transferred into plasmids pMAL9.1 (LamB), pMAL10.1 (BtuB) or
pHIE11 (FhuA), which had been digested with FseI and NotI. Using these two restriction enzymes,
which recognise an 8 bp GC rich sequence, the reading frame that was selected in the pMAL4.31 vector is
maintained in each of the platform vectors. The plasmid library was then transformed into E. coli
DH5alpha cells by electroporation. Cells were plated onto large LB-agar plates supplemented with 50
µg/ml Kanamycin and grown over night at 37°C at a density yielding clearly visible single colonies. Cells
were then scraped off the surface of these plates, washed with fresh LB medium and stored in aliquots for
library screening at -80°C.

Results 

Libraries for frame selection. Three libraries (LSPy70, LSPy150 and LSPy300) were generated in the
pMAL4.31 vector with sizes of approximately 70, 150 and 300 bp, respectively. For each library, ligation
and subsequent transformation of approximately 1 µg of pMAL4.31 plasmid DNA and 50 ng of
fragmented genomic S. pyogenes DNA yielded 4x 10^ to 2x 10^ clones after frame selection. To assess the
randomness of the libraries, approximately 600 randomly chosen clones of LSPy70 were sequenced. The
bioinformatic analysis showed that of these clones only very few were present more than once.
Furthermore, it was shown that 90% of the clones fell in the size range between 16 and 61 bp with an
average size of 34 bp (Figure 2). All sequences followed the 3n+1 rule, showing that all clones were
properly frame selected.
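
The quality figures quoted for the sequenced clones (average insert size, share of clones within a size range, and redundancy) can be derived from trimmed insert sequences as in the following sketch. The input format (a list of insert sequences as strings) and the function name are assumptions; the default size range merely reuses the 16 to 61 bp interval quoted above.

```python
from collections import Counter
from statistics import mean


def library_summary(inserts, low=16, high=61):
    """Summarise a set of trimmed insert sequences from randomly picked clones."""
    lengths = [len(seq) for seq in inserts]
    counts = Counter(inserts)
    return {
        "clones": len(inserts),
        "mean_bp": mean(lengths),
        # share of clones whose insert length lies within [low, high] bp
        "fraction_in_range": sum(low <= n <= high for n in lengths) / len(lengths),
        # number of distinct sequences that were found more than once
        "duplicated_sequences": sum(1 for c in counts.values() if c > 1),
    }
```

A low value of "duplicated_sequences" relative to the number of clones is what would be expected for a highly random library.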

Bacterial surface display libraries. The display of peptides on the surface of E. coli required the transfer of the
inserts from the LSPy libraries from the frame selection vector pMAL4.31 to the display plasmids
pMAL9.1 (LamB), pMAL10.1 (BtuB) or pHIE11 (FhuA). Genomic DNA fragments were excised by FseI
and NotI restriction and ligation of 5 ng inserts with 0.1 µg plasmid DNA and subsequent transformation
into DH5alpha cells resulted in 2-5x 10^ clones. The clones were scraped off the LB plates and frozen
without further amplification.

Example 3: Identification of highly immunogenic peptide sequences from S. pyogenes using bacterial 
surface displayed genomic libraries and human serum 

Experimental procedures 

MACS screening. Approximately 2.5x 10^ cells from a given library were grown in 5 ml LB-medium
supplemented with 50 µg/ml Kanamycin for 2 h at 37°C. Expression was induced by the addition of 1
mM IPTG for 30 min. Cells were washed twice with fresh LB medium and approximately 2x 10^ cells
re-suspended in 100 µl LB medium and transferred to an Eppendorf tube.

10 µg of biotinylated, human IgGs purified from serum was added to the cells and the suspension
incubated over night at 4°C with gentle shaking. 900 µl of LB medium was added, the suspension mixed
and subsequently centrifuged for 10 min at 6,000 rpm at 4°C. (For IgA screens, 10 µg of purified IgAs
were used and these captured with biotinylated anti-human-IgG secondary antibodies.) Cells were
washed once with 1 ml LB and then re-suspended in 100 µl LB medium. 10 µl of MACS microbeads
coupled to streptavidin (Miltenyi Biotech, Germany) were added and the incubation continued for 20 min
at 4°C. Thereafter 900 µl of LB medium was added and the MACS microbead cell suspension was loaded
onto the equilibrated MS column (Miltenyi Biotech, Germany) which was fixed to the magnet. (The MS
columns were equilibrated by washing once with 1 ml 70% EtOH and twice with 2 ml LB medium.)

The column was then washed three times with 3 ml LB medium. After removal of the magnet, cells were 
eluted by washing with 2 ml LB medium. After washing the column with 3 ml LB medium, the 2 ml 
eluate was loaded a second time on the same column and the washing and elution process repeated. The 
loading, washing and elution process was performed a third time, resulting in a final eluate of 2 ml. 

A second round of screening was performed as follows. The cells from the final eluate were collected by 
centrifugation and re-suspended in 1 ml LB medium supplemented with 50 µg/ml Kanamycin. The
culture was incubated at 37°C for 90 min and then induced with 1 mM IPTG for 30 min. Cells were
subsequently collected, washed once with 1 ml LB medium and suspended in 10 µl LB medium. Since the
volume was reduced, 1 µg of human, biotinylated IgGs was added and the suspension incubated over
night at 4°C with gentle shaking. All further steps were exactly the same as in the first selection round.
Cells selected after two rounds of selection were plated onto LB-agar plates supplemented with 50 µg/ml
Kanamycin and grown over night at 37°C. 

Evaluation of selected clones by sequencing and Western blot analysis. Selected clones were grown over night at 
37°C in 3 ml LB medium supplemented with 50 µg/ml Kanamycin to prepare plasmid DNA using
standard procedures. Sequencing was performed at MWG (Germany) or in collaboration with TIGR 
(U.S.A.). 

For Western blot analysis approximately 10 to 20 µg of total cellular protein was separated by 10%
SDS-PAGE and blotted onto HybondC membrane (Amersham Pharmacia Biotech, England). The LamB, BtuB
or FhuA fusion proteins were detected using human serum as the primary antibody at a dilution of 
approximately 1:5,000 and anti-human IgG or IgA antibodies coupled to HRP at a dilution of 1:5,000 as 
secondary antibodies. Detection was performed using the ECL detection kit (Amersham Pharmacia 
Biotech, England). Alternatively, rabbit anti FhuA or mouse anti LamB antibodies were used as primary 
antibodies in combination with the respective secondary antibodies coupled to HRP for the detection of 
the fusion proteins. 

Results 

Screening of bacterial surface display libraries by magnetic activated cell sorting (MACS) using biotinylated Igs.
The libraries LSPy70 in pMAL9.1, LSPy150 in pMAL10.1 and LSPy300 in pHIE11 were screened with
pools of biotinylated, human IgGs and IgAs from patient sera or sera from healthy individuals (see 
Example 1: Preparation of antibodies from human serum). The selection procedure was performed as 
described under Experimental procedures. Figure 3A shows a representative example of a screen with 
the LSPy-70 library and P4-IgGs. As can be seen from the colony count after the first selection cycle from 
MACS screening, the total number of cells recovered at the end is drastically reduced from 3x 10^ cells to 
approximately 5x 10^ cells, whereas the selection without antibodies added showed a reduction to about 
2x10^3 cells (Figure 3A). After the second round, a similar number of cells was recovered with P4-IgG,
while fewer than 10 cells were recovered when no IgGs from human serum were added, clearly showing 
that selection was dependent on S. pyogenes specific antibodies. To evaluate the performance of the 
screen, approximately 50 selected clones were picked randomly and subjected to Western blot analysis 
with the same, pooled serum (Figure 3B). This analysis revealed that 70% of the selected clones showed 
reactivity with antibodies present in the relevant serum whereas the control strain expressing LamB 
without a S. pyogenes specific insert did not react with the same serum. In general, the rate of reactivity 
was observed to lie within the range of 35 to 75%. Colony PCR analysis showed that all selected clones 
contained an insert in the expected size range. 
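
The two figures of merit used in this paragraph, the ratio of cells recovered with and without antibody and the rate of Western blot reactive clones, can be computed as sketched below. The function names are hypothetical and the numbers in the usage note are invented for illustration; they are not the counts of Figure 3.

```python
def signal_to_background(cells_with_antibody, cells_without_antibody):
    """Ratio of cells recovered with serum antibodies to the no-antibody control."""
    if cells_without_antibody == 0:
        return float("inf")
    return cells_with_antibody / cells_without_antibody


def reactive_rate(western_results):
    """Percentage of picked clones that reacted in the Western blot.

    western_results holds one boolean per analysed clone.
    """
    return 100.0 * sum(western_results) / len(western_results)
```

For instance, 5000 cells recovered with antibody against 10 cells in the control gives a ratio of 500, and 35 reactive clones out of 50 analysed gives a reactivity rate of 70%.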

Subsequent sequencing of a larger number of randomly picked clones (600 to 1200 per screen) led to the 
identification of the gene and the corresponding peptide or protein sequence that was specifically
recognized by the human serum used for screening. The frequency with which a specific clone is selected
reflects at least in part the abundance and/or affinity of the specific antibodies in the serum used for
selection and recognizing the epitope presented by this clone. In that regard it is striking that clones
derived from some ORFs (e.g. Spy0433, Spy2025) were picked more than 80 times, indicating their highly
immunogenic property. Table 1 summarizes the data obtained for all 15 performed screens. All clones
that are presented in Table 1 have been verified by Western blot analysis using whole cellular extracts
from single clones to show the indicated reactivity with the pool of human serum used in the respective
screen. As can be seen from Table 1, distinct regions of the identified ORF are identified as immunogenic,
since variably sized fragments of the proteins are displayed on the surface by the platform proteins.
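
Counting how often each ORF is hit among the sequenced clones is a simple tally. The following sketch assumes that every sequenced clone has already been assigned to an ORF name; the function names and the default threshold (taken from the "more than 80 times" remark above) are illustrative.

```python
from collections import Counter


def orf_hit_counts(clone_orfs):
    """Count how many sequenced clones map to each ORF."""
    return Counter(clone_orfs)


def frequently_selected(clone_orfs, threshold=80):
    """ORFs picked more than `threshold` times, most frequent first."""
    counts = orf_hit_counts(clone_orfs)
    return [orf for orf, n in counts.most_common() if n > threshold]
```

Such a tally only reflects selection frequency; as stated above, this depends on both the abundance and the affinity of the antibodies in the serum used.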

It is further worth noticing that most of the genes identified by the bacterial surface display screen encode
proteins that are either attached to the surface of S. pyogenes and/or are secreted. This is in accordance
with the expected role of surface attached or secreted proteins in virulence of S. pyogenes.

Example 4: Assessment of the reactivity of highly immunogenic peptide sequences with individual
human sera. 

Approximately 100 patient sera and 60 healthy adult sera were included in the analysis. Following the
bioinformatic analysis of selected clones, corresponding peptides were designed and synthesized. In case
of epitopes with more than 28 amino acid residues, overlapping peptides were made. All peptides were
synthesized with an N-terminal biotin-tag and used as coating reagents on Streptavidin-coated ELISA
plates.
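
Splitting a long epitope into overlapping peptides can be sketched as follows. The maximum peptide length of 28 residues is taken from the text; the overlap of 10 residues is an assumed value, since the overlap actually used is not stated here, and the function name is hypothetical.

```python
def overlapping_peptides(epitope, max_len=28, overlap=10):
    """Split an epitope into peptides of at most max_len residues.

    Consecutive peptides share at least `overlap` residues; the last peptide
    is aligned to the C-terminus so that the whole epitope is covered.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be smaller than max_len")
    if len(epitope) <= max_len:
        return [epitope]
    step = max_len - overlap
    peptides = []
    start = 0
    while start + max_len < len(epitope):
        peptides.append(epitope[start:start + max_len])
        start += step
    peptides.append(epitope[-max_len:])  # final peptide ends at the C-terminus
    return peptides
```

With these settings a 40-residue epitope yields two 28-residue peptides covering residues 1 to 28 and 13 to 40.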

The analysis was performed in two steps. First, peptides were selected based on their reactivity with the 
individual sera, which were included in the serum pools (five individual sera) used for preparations of 
IgG and IgA screening reagents for bacterial surface display. Peptides not displaying a positive reaction 
were not included in further, more detailed studies. Second, a large number of not pre-selected 
individual sera from patients with acute pharyngitis or with post-streptococcal diseases or from healthy 
adults and children were tested against the peptides showing specific and high reactivity with the 
screening sera. Antibody levels were measured by ELISA and compared by the score calculated for each 
peptide based on the number of positive sera and the extent of reactivity. An example for serum 
reactivity of 174 peptides representing S. pyogenes epitopes from the genomic screen with 20 human sera 
(representing 4 different pools of five sera) used for the antigen identification is shown in Table 2. The 
peptides range from highly and widely reactive to weakly positive ones. Among the most reactive ones 
there are known antigens, some of which are also protective in animal challenge models for 
nasopharyngeal carriage (e.g. C5a peptidase and M protein). 
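
A score combining the number of positive sera with the extent of reactivity could, for example, be computed as sketched below. The cutoff values and the grading scheme (1 point for a positive serum, 2 points for a strongly positive serum) are hypothetical and chosen only for illustration; the scoring formula actually applied is not specified in this text.

```python
def peptide_score(od_values, cutoff=0.2, strong=1.0):
    """Return (number of positive sera, graded score) for one peptide.

    `cutoff` and `strong` are assumed ELISA thresholds, not values
    taken from the described experiments.
    """
    positives = [od for od in od_values if od >= cutoff]
    # 1 point per positive serum, 2 points if strongly reactive
    graded = sum(2 if od >= strong else 1 for od in positives)
    return len(positives), graded
```

With four hypothetical ELISA readings of 0.05, 0.3, 1.2 and 0.19, two sera count as positive and the graded score is 3.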



Example 5: Gene distribution studies with highly immunogenic proteins identified from S. pyogenes. 

Gene distribution of group A streptococcal antigens by PCR. An ideal vaccine antigen would be an antigen 
that is present in all, or the vast majority of strains of the target organism to which the vaccine is directed. 
In order to establish whether the genes encoding the identified Streptococcus pyogenes antigens occur 
ubiquitously in S. pyogenes strains, PCR was performed on a series of independent S. pyogenes isolates 
with primers specific for the gene of interest. S. pyogenes isolates were obtained covering emm types most 
frequently present in patients as shown in Figure 4A. Oligonucleotide sequences as primers were 
designed for all identified ORFs yielding products of approximately 1,000 bp, if possible covering all 
identified immunogenic epitopes. Genomic DNA of all S. pyogenes strains was prepared as described 
under Example 2. PCR was performed in a reaction volume of 25 µl using Taq polymerase (1 U), 200 nM 
dNTPs, 10 pmol of each oligonucleotide and the kit according to the manufacturer's instructions 
(Invitrogen, The Netherlands). As standard, 30 cycles (1x: 5 min. 95°C; 30x: 30 sec., 30 sec. 56°C, 30 sec. 
72°C; 1x: 4 min. 72°C) were performed, unless conditions had to be adapted for individual primer pairs. 

Results 

All identified genes encoding immunogenic proteins were tested by PCR for their presence in 50 different 
strains of S. pyogenes (Figure 4A). As an example, Figure 4B shows the PCR reaction for Spy0269 with all 
indicated 50 strains. As clearly visible, the gene is present in all strains analysed. The PCR fragment from 
strain no. 8 (M89) was sequenced and showed that of 917 bp only 2 bp are different as compared to the S. 
pyogenes M1 strain SF370, resulting in only one amino acid difference between the two isolates. 
From a total of 96 genes analysed, 70 were present in all strains tested, while 22 genes were absent in 
more than 10 of the tested 50 strains (Table 3). Several genes (Spy0433, Spy0681) showed variation in size 
and were not present in all strain isolates. Some genes showed variation in size, but were otherwise 
conserved in all tested strains (e.g. Spy1371). Sequencing of the generated PCR fragment from one strain 
and subsequent comparison to the M1 strain confirmed the amplification of the correct DNA fragment 
and revealed a degree of sequence divergence as indicated in Table 3. Importantly, many of the identified 
antigens are well conserved in all strains in sequence and size and are therefore novel vaccine candidates 
to prevent infections by group A streptococci. 
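
The degree of sequence divergence between two aligned PCR fragments can be expressed as percent identity. The following is a minimal sketch assuming two already aligned sequences of equal length without gaps; the function name and the test sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions in two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)
```

Applied to the Spy0269 example above (2 differing positions in 917 bp), this corresponds to 915/917, i.e. approximately 99.78% identity at the nucleotide level.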

Example 6: Characterization of immune sera obtained from mice immunised with highly immunogenic 
proteins/peptides from S. pyogenes displayed on the surface of E. coli. 

Generation of immune sera from mice 

E. coli clones harboring plasmids encoding the platform protein fused to a S. pyogenes peptide were grown in 
LB medium supplemented with 50 µg/ml Kanamycin at 37°C. Overnight cultures were diluted 1:10, grown until 
an OD600 of 0.5 and induced with 0.2 mM IPTG for 2 hours. Pelleted bacterial cells were suspended in PBS buffer 
and disrupted by sonication on ice, generating a crude cell extract. According to the OD600 measurement, an 
aliquot corresponding to 5x10^ cells was injected into NMRI mice i.v., followed by a boost after 2 weeks. Serum 
was taken 1 week after the second injection. Epitope specific antibody levels were measured by peptide ELISA. 

In vitro expression of antigens 

Expression of antigens by in vitro grown S. pyogenes SF370/M1 was tested by immunoblotting. Different growth 
media and culture conditions were tested to detect the presence of antigens in total lysates and bacterial culture 
supernatants. Expression was considered confirmed when a specific band corresponding to the predicted 
molecular weight and electrophoretic mobility was detected. 

Cell surface staining 

Flow cytometric analysis was carried out as follows. Bacteria were grown under culture conditions, which 
resulted in expression of the antigen as shown by the immunoblot analysis. Cells were washed twice in Hanks 
Balanced Salt Solution (HBSS) and the cell density was adjusted to approximately 1 x 10^ CFU in 100 µl HBSS, 
0.5% BSA. After incubation for 30 to 60 min at 4°C with antisera diluted 50 to 100-fold, unbound antibodies 
were washed away by centrifugation in excess HBSS, 0.5% BSA. Secondary goat anti-mouse antibody (F(ab')2 
fragment specific) labeled with fluorescein (FITC) was incubated with the cells at 4°C for 30 to 60 min. After 
washing the cells, antibodies were fixed with 2% paraformaldehyde. Bound antibodies were detected using a 
Becton Dickinson FACScan flow cytometer and data further analyzed with the computer program CELLQuest. 
Control sera included mouse pre-immune serum and mouse polyclonal serum generated with lysates prepared 
from IPTG induced E. coli cells transformed with plasmids encoding the genes lamB or fhuA without S. pyogenes 
genomic insert. 

Opsonophagocytosis assay 

Epitope specific immune sera were tested for their activity to induce opsonophagocytosis in a FACS based 
assay. Sera were heat inactivated and anti-E. coli antibodies then removed by incubation with whole cell E. coli 
(3x). 10^ Alexa 488 labeled S. pyogenes cells were pre-opsonized in the presence of 2-10% immune serum and 2% 
hamster serum as complement source and then added to 10^ phagocytic cells (RAW264.7 or P388.D1 murine 
monocytic cell lines). The cell mixture was incubated for 30 min at 37°C. Time, IgG concentration and 
complement dependent uptake of bacteria was registered as an increase in mean fluorescence intensity of the 
phagocytic cells measured with a fluorescence activated cell sorter. 

Bactericidal (killing) assay 

Murine macrophage cells (RAW264.7 or P388.D1) and bacteria were incubated and the loss of viable bacteria 
after 60 min was determined by colony counting. In brief, bacteria were washed twice in Hanks Balanced Salt 
Solution (HBSS) and the cell density was adjusted to approximately 1 x 10^ CFU in 50 µl HBSS. Bacteria were 
incubated with mouse sera (up to 25%) and guinea pig complement (up to 5%) in a total volume of 100 µl for 
60 min at 4°C. Pre-opsonized bacteria were mixed with macrophages (murine cell line RAW264.7 or P388.D1; 2 x 
10^6 cells per 100 µl) at a 1:20 ratio and were incubated at 37°C on a rotating shaker at 500 rpm. An aliquot of each 
sample was diluted in sterile water and incubated for 5 min at room temperature to lyse macrophages. Serial 
dilutions were then plated onto Todd-Hewitt Broth agar plates. The plates were incubated overnight at 37°C, 
and the colonies were counted with the Countermat flash colony counter (IUL Instruments). Control sera 
included mouse pre-immune serum and mouse polyclonal serum generated with lysates prepared from IPTG 
induced E. coli transformed with plasmids harboring the genes lamB or fhuA without S. pyogenes genomic insert. 
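
The loss of viable bacteria determined by colony counting may be calculated as outlined in the following sketch. The function names, the dilution factor and the plated volume are assumptions for illustration; the text above does not specify how the counts were converted.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Back-calculate viable count of the undiluted sample from one plate."""
    return colonies * dilution_factor / plated_volume_ml


def percent_killing(cfu_start: float, cfu_end: float) -> float:
    """Loss of viable bacteria relative to the starting count, in percent."""
    if cfu_start <= 0:
        raise ValueError("starting count must be positive")
    return 100.0 * (cfu_start - cfu_end) / cfu_start
```

For example, 50 colonies from 0.1 ml of a hypothetical 1:1000 dilution correspond to 5 x 10^5 CFU/ml, and a reduction from 1 x 10^6 to 2.5 x 10^5 CFU corresponds to 75% killing.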

Results 

In vitro expression and cell surface staining. The expression of the antigenic proteins was analyzed in vitro in S. 
pyogenes SF370/M1 by using sera raised against E. coli clones harboring plasmids encoding the platform protein 
fused to a S. pyogenes peptide. This analysis served as a first step to determine whether a protein is expressed at 
all in order to evaluate surface expression of the polypeptide by FACS analysis. It was anticipated that not all 
proteins would be expressed under in vitro conditions, but several proteins were detected by Western blot 
analysis in total cell lysates (e.g. Spy0012, Spy0112, Spy0416, Spy0437, Spy0872, Spy1032, Spy1315, Spy1798; 
data not shown). Cell surface accessibility for several antigenic proteins was subsequently demonstrated by an 
assay based on flow cytometry. Streptococci were incubated with preimmune and polyclonal mouse sera raised 
against S. pyogenes lysate or E. coli clones harboring plasmids encoding the platform protein fused to a S. 
pyogenes peptide, followed by detection with fluorescently tagged secondary antibody. As shown in Fig. 5A, 
antisera raised against S. pyogenes lysate cause a shift in fluorescence of the S. pyogenes SF370/M1 cell 
population. Similar cell surface staining of S. pyogenes SF370/M1 cells was observed with polyclonal sera raised 
against peptides of antigen Spy0012 (Fig. 5B), Spy1315 and Spy1798 (Fig. 5C), although only a subpopulation of 
the bacteria was stained, as indicated by the detection of two peaks. This phenomenon may be a result of 
differential expression of the gene products during the growth of the bacterium or partial inhibition of antibody 
binding caused by other surface molecules. 

These experiments confirmed the bioinformatic prediction that these proteins are exported due to their signal 
peptide sequence and in addition showed that they are anchored on the cell surface of S. pyogenes SF370/M1. 
They also confirm that these proteins are available for recognition by human antibodies and make them 
valuable candidates for the development of a vaccine against Group A Streptococcal disease. 

Example 7: Protective immune responses against infection with group A streptococci upon immunization 
with recombinant antigens. 

Experimental procedures 

Cloning of genes encoding antigenic proteins 

The gene or DNA fragment of interest was amplified from genomic DNA of S. pyogenes SF370 by PCR 
amplification using gene specific primers. Apart from the gene specific sequence, the primers contained 
additional bases at the respective 5' end consisting of restriction sites that aided in the directional cloning of the 
amplified PCR product. The gene specific sequence of the primer ranged between 15-24 bases in length. The 
PCR products obtained were digested with the appropriate restriction enzymes and cloned into the 
appropriately digested pET28b(+) vector (NOVAGEN). After confirmation of the construction of the 
recombinant plasmid, E. coli BL21 STAR® cells (INVITROGEN) that served as expression hosts were 
transformed. These cells are optimized to efficiently express the gene of interest as encoded by the pET28b 
plasmid. 
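
The construction of such primers, consisting of a 5' restriction site followed by a gene specific sequence of 15-24 bases, may be sketched as follows. The restriction sites passed to the function, the two-base 5' clamp and all names are assumptions for illustration; the primers and enzymes actually used are not listed in this text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(gene: str, fwd_site: str, rev_site: str,
                   specific_len: int = 20, clamp: str = "GC") -> tuple[str, str]:
    """Return (forward, reverse) primers written 5' to 3'.

    `clamp` is an assumed short 5' extension in front of the site;
    `specific_len` must lie in the 15-24 base range named in the text.
    """
    if not 15 <= specific_len <= 24:
        raise ValueError("gene specific part must be 15-24 bases")
    if len(gene) < specific_len:
        raise ValueError("gene shorter than the gene specific part")
    forward = clamp + fwd_site + gene[:specific_len]
    reverse = clamp + rev_site + reverse_complement(gene[-specific_len:])
    return forward, reverse
```

A call such as design_primers(gene, "GGATCC", "CTCGAG") would use the BamHI and XhoI recognition sequences, both of which are chosen here purely as an example.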

Expression of antigens in Escherichia coli 

E. coli BL21 STAR® cells harboring the recombinant plasmid were grown into log phase in LB medium 
supplemented with 50 µg/ml Kanamycin at 37°C. Once an OD600nm of 0.8 was reached, the culture was induced 
with 1 mM IPTG for 3 hours at 37°C. The cells were harvested by centrifugation and lysed by a combination of the 
freeze-thaw method followed by disruption of cells with the Bug-buster® reagent from NOVAGEN. The lysate 
was separated by centrifugation into soluble (supernatant) and insoluble (pellet) fractions. 

Purification of recombinant proteins from E. coli 

Depending on the localization of the protein, different purification strategies were followed. Proteins in the 
soluble fraction were purified by binding the supernatant of the cell lysates after cell disruption to Ni-Agarose 
beads (Ni-NTA-Agarose®, QIAGEN). Due to the presence of the penta-Histidine (HIS) at the C, N or both 
termini of the expressed protein, the protein binds to Ni-agarose while other contaminating proteins are 
removed from the column by washing buffer. The proteins were eluted by a solution containing 100 mM 
imidazole in the appropriate buffer. The eluate was concentrated, assayed by Bradford for protein concentration 
and analysed by SDS-PAGE and Western blot. Proteins in the insoluble fraction were purified by solubilization 
of the pellet in an appropriate buffer containing 8 M Urea. The purification was performed under denaturing 
conditions (in buffer containing 8M Urea) using the same materials and procedure as mentioned above for 
soluble proteins. The eluate was concentrated and dialyzed to remove all urea in a gradual or stepwise manner. 
The final protein solution was concentrated, analysed by SDS-PAGE and measured by Bradford method. 
Expression was considered confirmed when a specific band corresponding to the predicted molecular weight 
and electrophoretic mobility was detected. For proteins, which precipitated during dialysis due to the removal 
of the denaturing reagent urea, the insoluble inclusion bodies were washed several times and directly used for 
immunization of mice. 

Immunisation of NMRI mice with recombinant proteins and challenge with S. pyogenes AP1 

The immunogenicity of the proteins was assayed in an experimental animal model using NMRI mice and the S. 
pyogenes strain AP1 as infectious agent. Ten female NMRI mice at 7-8 weeks of age were immunized with 
50 µg/dose of recombinant protein every 2 weeks for a total of 3 doses. The initial dose was adjuvanted with 
Complete Freund's adjuvant while the remaining two doses were adjuvanted with Incomplete Freund's 
adjuvant. At the end of the immunization the mice were bled to check the antibody titer and subsequently 
intravenously (i.v.) challenged with a lethal dose of S. pyogenes AP1 (5x 10^ pathogenic bacteria). The mice were 
scored for 18 to 21 days post challenge for survival. 
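
The survival scoring of a challenge group may be summarized as the fraction of animals alive at a given day post challenge, as in the following sketch. The function name and the example data are hypothetical and do not represent results of the described experiments.

```python
def survival_fraction(death_days: list[int], n_mice: int, day: int) -> float:
    """Fraction of the group still alive at the end of `day` post challenge.

    `death_days` holds the day of death for each animal that died;
    survivors are simply not listed.
    """
    if n_mice <= 0 or len(death_days) > n_mice:
        raise ValueError("inconsistent group size")
    deaths = sum(1 for d in death_days if d <= day)
    return (n_mice - deaths) / n_mice
```

For a hypothetical group of ten mice with deaths on days 3, 4, 4 and 10, the survival fraction at day 21 is 0.6.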

Results 

Expression and purification of recombinant proteins. 

Of the 31 proteins selected for recombinant protein expression, 29 proteins could be produced in E. coli to a level 
sufficient for purification. While some of the proteins could be produced as soluble protein (see Table 4), some 
proteins turned out to be insoluble (e.g. Spy0416B, Spy0872) or precipitated upon dialysis, which was intended to 
remove the denaturing reagent urea after solubilization of insoluble proteins such as Spy0031, Spy0292, Spy0720. 
In these cases the washed inclusion bodies were directly injected into mice for immunization. In general, the 
affinity purification yielded a recombinant protein preparation of at least 85% purity. 

Immune responses after immunization with recombinant proteins in NMRI mice. 

Table 4 lists those antigens, which were tested in mice and showed some degree of protection in 
experimental animals. Recombinant proteins, which were also tested in the bacteremia model in animals, 
but did not show any level of protection in the described experiments are not listed here, but include 
proteins such as Spy0012, Spy1063 and Spy1494. The described bacteremia model evaluates the protective 
value of vaccine candidates against invasive disease as pathogenic bacteria are directly injected into the 
blood. Recombinant proteins, which induce antibodies capable of protection against such group A 
streptococcal infection, are considered as valuable candidates for the development of a vaccine against 
Group A Streptococcal disease. In comparison to the positive control Spy2018 (M1 protein), which was 
previously shown to provide protection against S. pyogenes challenge, a number of antigens performed to 
a similar degree when the endpoint of the challenge experiment after 18 or 21 days (Table 4) was assessed 
(Spy0416, Spy1607 or Spy0292). Other proteins showed only a partial protective effect (Spy0720, 
Spy0872), but may prove very effective when combined with other antigens (Fig. 6). 

Surprisingly, the antigen screen had identified immunogenic epitopes predominantly in the first half of 
the two larger proteins, Spy0416 and Spy1972. Therefore it was reasoned that the protective region may 
also be contained in the N-terminal part of the protein. In case of Spy0416, both parts of the antigen were 
produced as recombinant protein (Spy0416A and Spy0416B; see Table 4) and tested in animal 
experiments. The experiments showed that only the first half of the protein Spy0416 (Table 4; Spy0416A) 
provided protection in the animal model, while the second half of the protein (Spy0416B) had no 
protective effect at all, clearly delineating a smaller region within the protein as the vaccine candidate. 
For antigen Spy1972 only the first half of the full-length protein was produced as recombinant protein 
and tested in the animal model. 

Example 8: Variability of genes encoding antigenic proteins in S. pyogenes strains of various serotypes. 
Experimental procedures 

Sequencing of PCR fragments and bioinformatic analysis. 

The PCR analysis of S. pyogenes strains is described in Example 5. The sequencing of the PCR fragments 
provided an estimate of the variability of the gene and a summary of the results is listed in Table 3. The 
availability of genomic sequences from five Streptococcus pyogenes strains (SF370: M1; MGAS8232: M18; SSI-1: 
M3; MGAS315: M3; Manfredo: M5) allowed a further assessment of the variability of the antigens. All sequences 
were aligned with the respective antigen sequence from S. pyogenes SF370 and those amino acid residues 
identified which differed from the ones in the antigenic protein from S. pyogenes SF370. Inserted or deleted 
sequences were detected in some of the antigenic proteins, but are not contained in this analysis. 
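
The identification of variable amino acid positions relative to the SF370 sequence may be sketched as follows. The sketch assumes sequences that are already aligned and of equal length (insertions and deletions are ignored, as in the analysis described above); the function name and the short test sequences are hypothetical.

```python
def variable_positions(reference: str, others: dict[str, str]) -> dict[int, tuple[str, dict[str, str]]]:
    """Map 1-based position -> (reference residue, {strain: differing residue})."""
    result: dict[int, tuple[str, dict[str, str]]] = {}
    for strain, seq in others.items():
        if len(seq) != len(reference):
            raise ValueError(f"{strain}: sequence not aligned to reference")
        for pos, (ref_aa, aa) in enumerate(zip(reference, seq), start=1):
            if aa != ref_aa:
                # record the exchange seen in this strain at this position
                result.setdefault(pos, (ref_aa, {}))[1][strain] = aa
    return result
```

Positions not present in the returned mapping are identical in all compared strains.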

Results 

Table 5 shows all positions that were identified to be variable in the indicated antigens in one of the four 
S. pyogenes strains (MGAS8232: M18; SSI-1: M3; MGAS315: M3; Manfredo: M5) or the strain used for 
sequencing of the amplified PCR fragment (see Table 3). The bioinformatic analysis shows that some of 
the antigenic proteins are very well conserved without a single amino acid exchange in any of the six strains of 
serotypes M1, M3, M5, M18 and M89. Proteins belonging to this group include Spy0103 and Spy1536, 
while the exchanges in the other antigenic proteins are more numerous in larger proteins than in smaller 
ones, as expected from the difference in size by itself. Although a variety of strains was analysed, it was 
almost never observed that a single residue was changed to more than one other amino acid in the other 
strains. A further analysis of sequences of the respective genes in a larger number of strains of varying 
serotypes, clinical indication or geographic location would certainly identify possible changes in those 
amino acid residues listed or in additional residues. 

Only one of the antigenic proteins analysed by the alignment of six gene sequences showed a 
considerable degree of variation in size (Spy1357: SF370 - 217 amino acids; MGAS8232 - 245 aa; SSI-1 - 
329 aa; MGAS315 - 329 aa; Manfredo - 279 aa). Thus it is evident that most of the evaluated antigens are 
very well conserved in sequence as well as in size and provide promising candidates for vaccine 
development. 

Table 1: Immunogenic proteins identified by bacterial surface display. 



PCT/EP2004/002087 



S. pyogenes antigenic protein | Putative function (by homology) | Predicted immunogenic aa** | No. of selected clones per ORF and screen | Location of identified immunogenic region (aa) | Seq. ID (DNA, Prot.)
Spy0012 | Hypothetical protein | 4-44, 57-65, 67-98, 101-107, 109-125, 131-144, 146-159, 168-173, 181-186, 191-200, 206-213, 229-245, 261-269, 288-301, 304-317, 323-328, 350-361, 374-384, 388-407, 416-425 | | 1-114 | 
Spy0019 | putative secreted protein (cell division and antibiotic tolerance) | 5-17, 49-64, 77-82, 87-98, 118-125, 127-140, 142-150, 153-159, 191-207, 212-218, 226-270, 274-287, 297-306, 325-331, 340-347, 352-369, 377-382, 390-395 | F:2, I:16, K:24, N:29, P:12 | 29-226 | 2, 152
Spy0025 | putative phosphoribosylformylglycinamidine synthase II | 4-16, 20-26, 32-74, 76-87, 93-108, 116-141, 148-162, 165-180, 206-219, 221-228, 230-236, 239-245, 257-268, 313-328, 330-335, 353-359, 367-375, 394-403, 414-434, 437-444, 446-453, 456-464, 478-487, 526-, 624-643, 653-665, 667-681, 697-718, 730-748, 755-761, 773-794, 806-821, 823-831, 837-845, 862-877, 879-889, 896-919, 924-930, 935-940, 947-955, 959-964, 969-986, 991-1002, 1012-1036, 1047-1056, 1067-1073, 1079-1085, 1088-1111, 1130-1135, 1148-1164, 1166-1173, 1185-1192, 1244-1254 | D:3 | 919-929 | 3, 153
Spy0031 | putative choline binding protein | 5-44, 62-74, 78-83, 99-105, 107-113, 124-134, 161-174, 176-194, 203-211, 216-237, 241-247, 253-266, 272-299, 323-349, 353-360 | I:3, K:3, N:3 | 145-305 | 4, 154
Spy0103 | putative competence protein | 15-39, 52-61, 72-81, 92-97 | A:8 | 71-81 | 5, 155
Spy0112 | putative pyrroline carboxylate reductase | 13-19, 21-31, 40-108, 115-122, 125-140, 158-180, 187-203, 210-223, 235-245 | B:4 | 173-186 | 6, 156
Spy0115 | putative glutamyl-aminopeptidase | 5-12, 19-27, 29-39, 59-67, 71-78, 80-88, 92-104, 107-124, 129-142, 158-168, 185-191, 218-226, 230-243, 256-267, 272-277, 283-291, 307-325, 331-344, 346-352 | A:3, C:26 | 316-331 | 7, 157
^pyuioo | Hypothetical protein | 6-28, 43-53, 60-76, 93-103 | [*22, K:7, M:17, O:31, P:5 | 21-99 | 8, 158
Spy0167 | Streptolysin O | 10-30, 120-126, 145-151, 159-169, 174-182, 191-196, 201-206, 214-220, 222-232, 254-272, 292-307, 313-323, 332-353, 361-369, 389-396, 401-415, 428-439, 465-481, 510-517, 560-568 | A:118, B:14, C:18, D:37, F:141, G:79, H:92, I:97, K:123, L:5, M:21, N:225, O:230, P:265 | 9-264 | 9, 159
Spy0168 | Hypothetical protein | 5-29, 39-45, 107-128 | K:4, N:7 | 1-112 | 10, 160


Spy0171 | hypothetical protein | 1-38, 42-50, 54-60, 65-71, 91-102 | 3:2 | >l-56 | 11, 161
Spy0183 | putative glycine betaine/proline ABC transporter | 1-13, 19-25, 41-51, 54-62, 68-75, 79-89, 109-122, 130-136, 172-189, 192-198, 217-224, 262-268, 270-276, 281-298, 315-324, 333-342, 353-370, 376-391 | ^:6 | 23-39 | 12, 162
Spy0230 | putative ABC transporter (ATP-binding protein) | 5-41, 49-58, 62-103, 117-124, 147-166, 173-194, 204-211, 221-229, 255-261, 269-284, 288-310, 319-325, 348-380, 383-389, 402-410, 424-443, 467-479, 496-517, 535-553, 555-565, 574-581, 583-591 | C:46 | 474-489 | 13, 163
Spy0269 | putative surface exclusion protein | 8-35, 52-57, 66-73, 81-88, 108-114, 125-131, 160-167, 174-180, 230-235, 237-249, 254-262, 278-285, 308-314, 321-326, 344-353, 358-372, 376-383, 393-411, 439-446, 453-464, 471-480, 485-492, 502-508, 523-529, 533-556, 558-563, 567-584, 589-597, 605-619, 625-645, 647-666, 671-678, 690-714, 721-728, 741-763, 'Jeye-'JJZ, 777-787, 792-802, 809-823, 849-864 | A:2, B:12, D:3, F:11, H:5, N:6 | 37-241, 409-534, 582-604, 743-804 | 14, 164
Spy0287 | conserved hypothetical protein | 4-17, 24-36, 38-44, 59-67, 72-90, 92-121, 126-149, 151-159, 161-175, 197-215, 217-227, 241-247, 257-264, 266-275, 277-284, 293-307, 315-321, 330-337, 345-350, 357-366, 385-416 | K:1 | 202-337 | 15, 165
Spy0292 | penicillin-binding protein (D-alanyl-D-alanine car | 4-20, 22-46, 49-70, 80-89, 96-103, 105-119, 123-129, 153-160, 181-223, 227-233, 236-243, 248-255, 261-269, 274-279, 283-299, 305-313, 315-332, 339-344, 349-362, 365-373, 380-388, 391-397, 402-407 | F:2 | 1-48 | 16, 166
Spy0295 | oligopeptide permease | 18-37, 41-63, 100-106, 109-151, 153-167, 170-197, 199-207, 212-229, 232-253, 273-297 | A:3 | 203-217 | 17, 167
Spy0348 | putative aminodeoxychorismate lyase | 20-26, 54-61, 80-88, 94-101, 113-119, 128-136, 138-144, 156-188, 193-201, 209-217, 221-229, 239-244, 251-257, 270-278, 281-290, 308-315, 319-332, 339-352, 370-381, 388-400, 411-417, 426-435, 468-482, 488-497, 499-506, 512-521 | D:5, I:3, M:3, P:3 | 261-273 | 18, 168
Spy0416 | putative cell envelope serine proteinase | 6-12, 16-36, 50-56, 86-92, 115-125, 143-152, 163-172, 193-203, 235-244, 280-289, 302-315, 325-348, 370-379, 399-405, 411-417, 419-429, 441-449, 463-472, 482-490, 500-516, 536-543, 561-569, 587-594, 620-636, 647-653, 659-664, 677-685, 687-693, 713-719, 733-740, 746-754, 756-779, 792-799, 808-817, 822-828, 851-865, 902-908, 920-938, 946-952, 969-976, 988-1005, 1018-1027, 1045-1057, 1063-1069, 1071-1078, 1090-1099, 1101-1109, 1113-1127, 1130-1137, 1162-1174, 1211-1221, 1234-1242, 1261-1268, 1278-, 1312-1317, 1319-1326, 1345-1353, 1366-1378, 1382-1394, 1396-1413, 1415-1424, 1442-1457, 1467-1474, 1482-1490, 1492-1530, 1537-1549, 1559-1576, 1^11 | A:3, B:4, C:30, D:13, F:138, G:120, H:101, I:9, K:14, M:2, N:15, O:8, P:19 | 1-414, 443-614, 997-1392 | 19, 169








Spy0430 | hypothetical protein | 14-42, 70-75, 90-100, 158-181 | B:7, I:10, P:18 | 1-164 | 20, 170
Spy0433 | hypothetical protein | 4-21, 30-36, 54-82, 89-97, 105-118, 138-147 | A:138, B:8, C:67, D:11, E:13, F:35, G:10, H:5, M:8 | 126-207 | 21, 171
Spy0437 | Hypothetical protein | 4-21, 31-66, 96-104, 106-113, 131-142 | A:29, B:10, C:21, D:24, E:15 | 180-204 | 22, 172
Spy0469 | putative 42 kDa protein | 5-23, 31-36, 38-55, 65-74, 79-88, 101-129, 131-154, 156-165, 163-194, 225-237, 245-261, 264-271, 279-284, 287-297, 313-319, 327-336, 343-363, 380-386 | B:5, F:77, I:8, K:15, M:3, N:17, O:20 | 11-197, 204-219, 258-372 | 23, 173
Spy0488 | hypothetical protein | 4-20, 34-41, 71-86, 100-110, 113-124, 133-143, 150-158, 160-166, 175-182, 191-197, 213-223, 233-239, 259-278, 298-322 | A:17, B:11, C:23, D:12, E:4, G:4, H:7 | 195-289 | 24, 174
Spy0515 | Putative sugar transferase | 4-10, 21-35, 44-52, 54-62, 67-73, 87-103, 106-135, 161-174, 177-192, 200-209, 216-223, 249-298, 304-312, 315-329 | B:5, I:3 | 12-130 | 25, 175
Spy0580 | conserved hypothetical protein | 10-27, 33-38, 48-55, 70-76, 96-107, 119-133, 141-147, 151-165, 183-190, 197-210, 228-236, 245-250, 266-272, 289-295, 297-306, 308-315, 323-352, 357-371, 381-390, 394-401, 404-415, 417-425, 427-462, 466-483, 485-496, 502-507, 520-529, 531-541, 553-570, 577-588, 591-596, 600-610, 619-632, 642-665, 671- ^lOO £.C\A TrtT | C:5 | 434-444 | 26, 176
Spy0621 | conserved hypothetical protein | 6-14, 16-25, 36-46, 52-70, 83-111, 129-138, 140-149, 153-166, 169-181, 188-206, 212-220, 223-259, 261-269, 274-282, 286-293, 297-306, 313-319, 329-341, 343-359, 377-390, 409-415, 425-430 | C:3 | 360-375 | 27, 177
Spy0630 | putative PTS dependent N-acetyl-galactosamine-IIC | 4-26, 2o-4o, 54-62, 88-121, 147-162, 164-201, 203-237, 245-251 | C:2 | 254-260 | 28, 178
Spy0681 | hypothetical protein, phage associated | 12-21 Jfi-'V} fsfk-U ft7-Q'^ Q8-n9 T?R~1AQ 170-90*^, 209-226, 233-242, 249-261, 266-271, 273-289, 293-318, 346-354, 360-371, 391-400 | | do9-ooZ | 29, 179
Spy0683 | putative minor capsid protein, phage associated | 11-38, 44-65, 70-87, 129-135, 140-163, 171-177, 225-232, 238-249, 258-266, 271-280, 284-291, 295-300, 329-337, 344-352, 405-412, 416-424, 426-434, 436-455, 462-475, 478-487 | B:11, D:4 | 270-312 | 30, 180








Spy0702 | Hypothetical protein | 5-17, 34-45, 59-69, 82-88, 117-129, 137-142, 158-165, 180-195, 201-206, 219-226, 241-260, 269-279, 292-305, 312-321, 341-347, 362-381, 396-410, 413-432, 434-445, 447-453, 482-487, 492-499, 507-516, 546-552, 556-565, 587-604 | L:2 | 486-598 | 31, 181
Spy0710 | conserved hypothetical protein, phage associated | 4-15, 17-32, 40-47, 67-78, 90-98, 101-107, 111-136, 161-171, 184-198, 208-214, 234-245, 247-254, 272-279, 288-298, 303-310, 315-320, 327-333, 338-349, 364-374 | B:10 | 378-396 | 32, 182
Spy0711 | pyrogenic exotoxin C precursor, phage associated (speC) | 5-27, 33-49, 51-57, 74-81, 95-107, 130-137, 148-157, 173-184 | K:2 | 75-235 | 33, 183
Spy0720 | conserved hypothetical protein | 6-23, 47-53, 57-63, 75-82, 97-105, 113-122, 124-134, 142-153, 159-164, 169-179, 181-187, 192-208, 215-243, 247-257, 285-290, 303-310 | D:2 | 30-51 | 34, 184
Spy0727 | putative DNA gyrase, subunit B | 17-29, 44-52, 59-73, 77-83, 86-92, 97-110, 118-153, 156-166, 173-179, 192-209, 225-231, 234-240, 245-251, 260-268, 274-279, 297-306, 328-340, 353-360, 369-382, 384-397, 414-423, 431-436, 452-465, 492-498, 500-508, 516-552, 554-560, 568-574, 580-586, 609-617, 620-626, 641-647 | M:26 | 208-219 | 35, 185
Spy0737 | putative extracellular matrix binding protein | 4-26, 32-45, 58-72, 111-119, 137-143, 146-159, 187-193, 221-231, 235-242, 250-273, 290-304, 311-321, 326-339, 341-347, 354-368, 397-403, 412-419, 426-432, 487-506, 580-592, 619-628, 663-685, 707-716, 743-751, 770-776, 787-792, 850-859, 866-873, 882-888, 922-931, 957-963, 975-981, 983-989, 1000-1008, 1023-1029, 1058-1064, 1089-1099, 1107-1114, 1139-1145, 1147-1156, 1217-1226, 1276-1281, 1329-1335, 1355-1366, 1382-1394, 1410-1416, 1418-1424, 1443-1451, 1461-1469, 1483-1489, 1491-1501, 1515-1522, 1538-1544, 1549-1561, 1587-1593, 1603-1613, 1625-1630, 1636-1641, 1684-1690, 1706-1723, 1765-1771, 1787-1804, 1850-1857, 1863-1894, 1897-1910, 1926-1935, 1937-1943, 1960-1983, 1991-2005, 2008-2014, 2018-2039 | B:5, E:3, K:11 | 396-533, 1342-1502, 1672-1920 | 36, 186


Spy0747 


extracellular nuclease 


4-25, 45-50, 53-65, 79-85, 87-92, 99-109, 126-137, 
141-148, 156-183, 190-203, 212-217, 221-228, 235- 


A:72,B:17,H:6, 
0:3 


1-113 
210-232 


37, 187 
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S. jnfogettes 


Putative function 


predicted immunogenic aa*"*^ 


No. of selected 


Location of 


Seq. 


antigenic 


(by homology) 




clones per ORF 


identified 


Prot.) 


protein 






and screen 


immunogeni 

c region (aa) 






242, 247-277, 287-293, 300-319, 321-330, 341-361, 




250-423 








378-389, 394-406, 437-449, 455-461, 472-478, 482- 




536-564 








491, 507-522, 544-554, 576-582, 587-593, 611-621, 












626-632, 649-661, 679-685, 696-704, 706-716, 726- 












736, 740-751, 759-766, 786-792, 797-802, 810-822, 












824-832, 843-852, 863-869, 874-879, 882-905 












4-16, 33-39, 43-49, 54-85, 107-123, 131-147, 157-169, 


C:4, E:2 


617-635 


38, 188 






177-187, 198-209, 220-230, 238-248, 277-286, 293- 












301, 303-315, 319-379, 383-393, 402-414, 426-432, 












439-449, 470-478, 483-497, 502-535, 552-566, 571- 












582, 596-601, 608-620, 631-643, 651-656, 663-678, 












680-699, 705-717, 724-732, 738-748, 756-763, 766- 












772, 776-791, 796-810, 819-827, 829-841, 847-861, 








Spy0777 


putative ATP- 
dependent 
exonuclease/ subunit 
A 


866-871, 876-882, 887-894, 909-934, 941-947, 957- 
969, 986-994, 998-1028, 1033-1070, 1073-1080, 1090- 
1096, 1098-1132, 1134-1159, 1164-1172, 1174-1201 








Spy0789 


putative ABC- 
transporter (permease 
protein 


7-25, 30-40, 42-64, 70-77, 85-118, 120-166, 169-199, 
202-213, 222-244


A:3 


190-203 


39, 189 






4-11, 15-53, 55-93, 95-113, 120-159, 164-200, 210- 


A:7,D:2 


385-398 


40, 190 






243, 250-258, 261-283, 298-319, 327-340, 356-366, 












369-376, 380-386, 394-406, 409-421, 425-435, 442- 








Spy0839 


putative 

r phosphodieste 


454, 461-472, 480-490, 494-505, 507-514, 521-527, 
533-544, 566-574 












5-36, 66-72, 120-127, 146-152, 159-168, 172-184, 


A:11, B:3, C:5,


12-190 


41, 191 






205-210, 221-232, 234-243, 251-275, 295-305, 325- 


D:4,F:50,H:19, 


276-283 








332, 367-373, 470-479, 482-487, 520-548, 592-600, 


G:49, I:112, K:102,


666-806 








605-615, 627-642, 655-662, 664-698, 718-725, 734- 


L:10, M:3, N:213, 










763, 776-784, 798-809, 811-842, 845-852, 867-872,


O:188, P:310






Spy0843 


cell surface protein 


879-888, 900-928, 933-940, 972-977, 982-1003 












4-38, 63-68, 100-114, 160-173, 183-192, 195-210, 


A:6, D:2, F:5, 


30-80 


42, 192 






212-219, 221-238, 240-256, 258-266, 274-290, 301- 


H:14,I:9,K:10, 


89-105 








311, 313-319, 332-341, 357-363, 395-401, 405-410, 


L:1, N:16, O:12


111-151 








420-426, 435-450, 453-461, 468-475, 491-498, 510- 








Spy0872 


putative secreted 5'-nucleotidase


518, 529-537, 545-552, 585-592, 602-611, 634-639, 

650-664 








Spy0895 


histidine protein 
kinase 


7-29, 31-39, 47-54, 63-74, 81-94, 97-117, 122-127, 
146-157, 168-192, 195-204, 216-240, 251-259 


C:11


195-203 


43, 193 


Spy0972 


putative terminase, 
large subunit - phage 


5-16, 28-34, 46-65, 79-94, 98-105, 107-113, 120-134, 


B:2 


32-50 


44,194 






147-158, 163-172, 180-186, 226-233, 237-251, 253- 
259, 275-285, 287-294, 302-308, 315-321, 334-344, 


joU-o/l, cryzi-*±l.Z.f 4zU-4Zo 








Spy0981


hypothetical protein - 
phage associated 


8-20, 30-36, 71-79, 90-96, 106-117, 125-138, 141-147, 
166-174 


A:7,B:2 


75-90 


45, 195 


Spy1008


streptococcal exotoxin 
H precursor (speH) 


4-13, 15-33, 43-52, 63-85, 98-114, 131-139, 146-174, 

186-192, 198-206, 227-233 


C:11


69-88 


46, 196 


Spy1032


extracellular 
hyaluronate lyase 


4-22, 29-35, 59-68, 153-170, 213-219, 224-238, 240- 
246, 263-270, 285-292, 301-321, 327-346, 356-371, 
389-405, 411-418, 421-427, 430-437, 450-467, 472- 
477, 482-487, 513-518, 531-538, 569-576, 606-614, 
637-657, 662-667, 673-690, 743-753, 760-767, 770- 
777, 786-802 


B:3,K:3,M:5 


96-230 

361-491 

572-585 


47, 197 


Spy1054


putative collagen-like 
protein (SclC) 


4-12, 21-36, 48-55, 74-82, 121-127, 195-203, 207-228, 
247-262, 269-278, 280-289 


A:71, B:13, C:233, 
D:41, E:163, 
F:200, G:442, 
H:129, N:3 


102-210 


48, 198 


Spy1063


putative periplasmic- 
iron-binding protein 


13-20, 23-31, 38-44, 78-107, 110-118, 122-144, 151- 
164, 176-182, 190-198, 209-216, 219-243, 251-256, 
289-304, 306-313 


A:4 


240-248 


49, 199 


Spy1162


putative ribonuclease 

HII


5-26, 34-48, 57-77, 84-102, 116-132, 139-145, 150- 
162, 165-173, 176-187, 192-205, 216-221, 234-248, 
250-260 


B:3, C:5 


182-198 


50, 200 


Spy1206


putative ABC 
transporter 


10-19, 26-44, 53-62, 69-87, 90-96, 121-127, 141-146, 
148-158, 175-193, 204-259, 307-313, 334-348, 360- 
365, 370-401, 411-439, 441-450, 455-462, 467-472, 
488-504 


A:2 


41-56 


51, 201 


Spy1228


Putative lipoprotein 


5-21, 36-42, 96-116, 123-130, 138-144, 146-157, 
184-201, 213-228, 252-259, 277-297, 308-313, 
318-323, 327-333


M:33 


202-217 


52, 202 


Spy1245


putative phosphate 
ABC transporter 


6-26, 33-51, 72-90, 97-131, 147-154, 164-171, 
187-216, 231-236, 260-269, 275-283 


D:3, K:3


1-127 


53,203 


Spy1315


hypothetical protein 


4-22, 24-38, 44-58, 72-88, 99-108, 110-117, 123-129, 
131-137, 142-147, 167-178, 181-190, 206-214, 217- 
223, 271-282, 290-305, 320-327, 329-336, 343-352, 
354-364, 396-402, 425-434, 451-456, 471-477, 485- 
491, 515-541, 544-583, 595-609, 611-626, 644-656, 
660-681, 683-691, 695-718 


B:4 


297-458 


54, 204 


Spy1357


protein GRAB 
(protein G-related 


5-43, 92-102, 107-116, 120-130, 137-144, 155-163, 


G:27,H:8,K:2, 


24-135


55, 205 
alpha 2M-binding protein)


169-174, 193-213 


N:4 






Spy1361


putative internalin A
precursor 


4-25, 61-69, 73-85, 88-95, 97-109, 111-130, 135-147, 
150-157, 159-179, 182-201, 206-212, 224-248, 253- 
260, 287-295, 314-331, 338-344, 365-376, 396-405, 
413-422, 424-430, 432-449, 478-485, 487-494, 503- 

517, 522-536, 544-560, 564-578, 585-590, 597-613, 
615-623, 629-636, 640-649, 662-671, 713-721 


F:21, G:26, H:6, 
K:4, N:5 


176-330 


56, 206 


Spy1371


putative NADP- 

dependent 

glyceraldehyde-3- 

phosphate 

dehydrogenase 


31-37, 41-52, 58-79, 82-105, 133-179, 184-193, 199- 
205, 209-226, 256-277, 281-295, 297-314, 322-328, 
331-337, 359-367, 379-395, 403-409, 417-432, 442- 
447, 451-460, 466-472 


D:14, H:3 


46-62 
296-341 


57, 207 


Spy1375


putative 
ribonucleotide 
reductase alpha-c 


23-29, 56-63, 67-74, 96-108, 122-132, 139-146, 152- 

159, 167-178, 189-196, 214-231, 247-265, 274-293, 
301-309, 326-332, 356-363, 378-395, 406-412, 436- 
442, 445-451, 465-479, 487-501, 528-555, 567-581, 
583-599, 610-617, 622-629, 638-662, 681-686, 694- 
700, 711-716 


A:2 


667-684 


58, 208 


Spy1389


putative alanyl-tRNA 
synthetase 


20-51, 53-59, 109-115, 140-154, 185-191, 201-209, 
212-218, 234-243, 253-263, 277-290, 303-313, 327- 

337, 342-349, 374-382, 394-410, 436-442, 464-477, 
486-499, 521-530, 536-550, 560-566, 569-583, 652- 
672, 680-686, 698-704, 718-746, 758-770, 774-788,


oUz-o27, o3o-842, ool-o69 


B:2, P:3 


258-416 


59, 209 


Spy1390


putative protease 
maturation protein 


7-25, 39-45, 59-70, 92-108, 116-127, 161-168, 202- 

Zll, Z17~ZZ7, 2Zy-23y, 254-262, 2/l-27o, 291-300 


A:3,B:2,D:3 


278-295 


60, 210 


Spy1422


putative 

recombination protein 


4-20, 27-33, 45-51, 53-62, 66-74, 81-88, 98-111, 124- 
130, 136-144, 156-179, 183-191 


C:2 


183-195 


61, 211 


Spy1436


putative 

deoxyribonuclease 


12-24, 27-33, 43-49, 55-71, 77-85, 122-131, 168-177, 

179-203, 209-214, 226-241 


K:1


63-238 


62, 212 


Spy1494


hypothetical protein


4-19, 37-50, 120-126, 131-137, 139-162, 177-195, 
200-209, 211-218, 233-256, 260-268, 271-283, 288- 

308 


G:3, I:5, K:6, M:5,
N:10, O:6, P:4


1-141 


63, 213 


Spy1523


cell division protein 


11-17, 40-47, 57-63, 96-124, 141-162, 170-207, 223- 
Z.ODf z^l-zoD, z/l-z//, zoi-oUU, oiz-olo, OZ./-OOdf 

373-379 


I:2


231-368 


64, 214 


Spy1536


conserved 
hypothetical protein 


9-33, 41-48, 57-79, 97-103, 113-138, 146-157, 165- 
186, 195-201, 209-215, 223-229, 237-247, 277-286, 
290-297, 328-342


A:19, C:3 


247-260 


65, 215 
conserved

hypothetical protein 


7-15, 39-45, 58-64, 79-84, 97-127, 130-141, 163-176, 


C:4 


64-72 


66, 216 


Spy1564


195-203, 216-225, 235-247, 254-264, 271-279 












4-12, 26-42, 46-65, 73-80, 82-94, 116-125, 135-146, 


B:2, K:2 


222-362 


67, 217 






167-173, 183-190, 232-271, 274-282, 300-306, 320- 




756-896 








343, 351-362, 373-383, 385-391, 402-409, 414-426, 












434-455, 460-466, 473-481, 485-503, 519-525, 533- 












542, 554-565, 599-624, 645-651, 675-693, 717-725, 












751-758, 767-7^5, 792-797, 801-809, 819-825, 831- 










conserved 


836, 859-869, 890-897 








Spy1604


hypothetical protein 












11-17, 22-28, 52-69, 73-83, 86-97, 123-148, 150-164, 


D:5 


153-170 


68, 218 






166-177, 179-186, 188-199, 219-225, 229-243, 250- 










conserved 


255 








Spy1607


hypothetical protein 












4-61, 71-80, 83-90, 92-128, 133-153, 167-182, 184- 


C:4 


56-73 


69, 219 


Spy1615


putative late 
competence protein 


192, 198-212 












4-19, 26-37, 45-52, 58-66, 71-77, 84-92, 94-101, 107- 


D:2 


298-312 


70, 220 






118, 120-133, 156-168, 170-179, 208-216, 228-238, 








Spy1666


conserved 

hypothetical protein 


253-273, 280-296, 303-317, 326-334 










hypothetical protein 


7-13, 27-35, 38-56, 85-108, 113-121, 123-160, 163- 


B:5 


141-157 


71, 221 


Spy1727


169, 172-183, 188-200, 206-211, 219-238, 247-254 












23-39, 45-73, 86-103, 107-115, 125-132, 137-146, 


D:3 


433-440 


72,222 






148-158, 160-168, 172-179, 185-192, 200-207, 210- 




572-593 








224, 233-239, 246-255, 285-334, 338-352, 355-379, 












383-389, 408-417, 423-429, 446-456, 460-473, 478- 










putative ATP- 


503, 522-540, 553-562, 568-577, 596-602, 620-636, 








Spy1785


helicase 


640-649, 655-663 












4-42, 46-58, 64-76, 118-124, 130-137, 148-156, 164- 


A:12,I:12,K:7, 


17-319 


73,223 






169, 175-182, 187-194, 203-218, 220-227, 241-246, 


N:17, O:13, P:8


417-563 








254-259, 264-270, 275-289, 296-305, 309-314, 322- 












334, 342-354, 398-405, 419-426, 432-443, 462-475, 












522-530, 552-567, 593-607, 618-634, 636-647, 653- 












658, 662-670, 681-695, 698-707, 709-720, 732-742, 












767-792, 794-822, 828-842, 851-866, 881-890, 895- 












903, 928-934, 940-963, 978-986, 1003-1025, 1027- 












iU43, 1UJ30-1U/0, lUoU-lUo/, lUyo-llU9, 1115-1122, 












1133-1138, 1168-1174, 1179-1186, 1207-1214, 1248- 








Spy1798


hypothetical protein


1267 










immunogenic 


6-19, 23-33, 129-138, 140-150, 153-184, 190-198, 


H:2,I:8,K:6,N:11 


46-187 


74, 224 


Spy1801


secreted protein 
precursor homolog 


206-219, 235-245, 267-275, 284-289, 303-310, 322- 
328, 354-404, 407-413, 423-446, 453-462, 467-481,












491-500 












4-34, 39-57, 78-86, 106-116, 141-151, 156-162, 165- 


I:16, K:12, N:6


21-244 


75, 225 






172, 213-237, 252-260, 262-268, 272-279, 296-307, 




381-499 








332-338, 397-403, 406-416, 431-446, 448-453, 464- 




818-959 








470, 503-515, 519-525, 534-540, 551-563, 578-593, 












646-668, 693-699, 703-719, 738-744, 748-759, 771- 












777, 807-813, 840-847, 870-876, 897-903, 910-925, 








Spy1813


hypothetical protein 



967-976, 979-992 










putative translation 


19-29, 65-75, 90-109, 111-137, 155-165, 169-175 


C:6 


118-136 


76, 226 


Spy1821


elongation factor EF-P 














15-20, 30-36, 55-63, 73-79, 90-117, 120-127, 136-149, 


C:8 


147-155 


77, 227 






166-188, 195-203, 211-223, 242-255, 264-269, 281- 












287, 325-330, 334-341, 348-366, 395-408, 423-429, 








Spy1916


beta-D-galactosidase 


436-444, 452-465












11-18, 21-53, 77-83, 91-98, 109-119, 142-163, 173- 


A:6, I:2, K:5, N:9


74-438 


78, 228 






181, 193-208, 216-227, 238-255, 261-268, 274-286, 












290-297, 308-315, 326-332, 352-359, 377-395, 399- 












406, 418-426, 428-438, 442-448, 458-465, 473-482, 












488-499, 514-524, 543-553, 564-600, 623-632, 647- 












654, 660-669, 672-678, 710-723, 739-749, 787-793, 












820-828, 838-860, 889-895, 901-907, 924-939, 956- 












962, 969-976, 991-999, 1012-1018, 1024-1029, 1035- 








Spy1972


Pullulanase


1072, 1078-1091, 1142-1161 












4-31, 41-52, 58-63, 65-73, 83-88, 102-117, 123-130, 


I:6, M:3, N:10


156-420


79, 229 






150-172, 177-195, 207-217, 222-235, 247-253, 295- 








Spy1979


streptokinase A 
precursor 



305, 315-328, 335-342, 359-365, 389-394, 404-413 












4-42, 56-69, 98-108, 120-125, 210-216, 225-231, 276- 


A:81,B:24,F:19, 


79-348 


80,230 


Spy1983


collagen-like surface 
protein (SclD) 


285, 304-310, 313-318, 322-343 


G:41, I:2, K:2










12-21, 24-30, 42-50, 61-67, 69-85, 90-97, 110-143, 


D:2 


53-70 


81, 231 


Spy1991


anthranilate synthase 
component II


155-168 












4-26, 41-54, 71-78, 88-96, 116-127, 140-149, 151-158, 


B:3, N:2 


183-341 


82, 232 






161-175, 190-196, 201-208, 220-226, 240-247, 266- 












281, 298-305, 308-318, 321-329, 344-353, 370-378, 












5Qy| /inc /I1Q >10^ yfOO AA*^ AW7 A£J>. AOA Kr\K A 
Jo4-4UO, 41o-4iiiO, '±/^-'y±£.t ^y^rOUO, Oi4r 








Spy2000 


surface lipoprotein 


522 












4-27, 69-77, 79-101, 117-123, 126-142, 155-161, 171- 


A:15, B:9, C:5, 


92-231 


83, 233 






186, 200-206, 213-231, 233-244, 258-263, 269-275, 


D:3, F:18, G:25, 


618-757 




Spy2006 


hypothetical protein 


315-331, 337-346, 349-372, 376-381, 401-410, 424- 


H:5, M:10, N:5 
445, 447-455, 463-470, 478-484, 520-536, 546-555,












558-569, 580-597, 603-618, 628-638, 648-660, 668- 












683, 717-723, 765-771, 781-788, 792-806, 812-822












11-47, 63-75, 108-117, 119-128, 133-143, 171-185, 


B:2,I:7,K:7,P:2 


41-170 


84,234 






190-196, 226-232, 257-264, 278-283, 297-309, 332- 








Spy2009 


hypothetical protein


338, 341-346, 351-358, 362-372 












6-26, 50-56, 83-89, 108-114, 123-131, 172-181, 194- 


A:47, B:10, D:3, 


20-487 


85, 235 






200, 221-238, 241-259, 263-271, 284-292, 304-319, 


F:48, G:20, H:4, 


757-1153 








321-335, 353-358, 384-391, 408-417, 424-430, 442- 


E:6, K:13, M:5, 










448, 459-466, 487-500, 514-528, 541-556, 572-578, 


N:10, P:6










595-601, 605-613, 620-631, 634-648, 660-679, 686- 












693, 702-708, 716-725, 730-735, 749-755, 770-777, 












805-811, 831-837, 843-851, 854-860, 863-869, 895- 












901, 904-914, 922-929, 933-938, 947-952, 956-963, 








Spy2010 


C5A peptidase 
precursor 


1000-1005, 1008-1014, 1021-1030, 1131-1137, 1154- 


1164, 1166-1174 












10-34, 67-78, 131-146, 160-175, 189-194, 201-214, 


A:11, B:38, C:16,


26-74 


86, 236 






239-250, 265-271, 296-305 


F:56, G:27, H:13, 


91-100 




Spy2016 


inhibitor of 
complement (Sic) 




K:5, N:2, O:3,
P:14 


105-303 








9-15, 19-32, 109-122, 143-150, 171-180, 186-191, 


A:316, B:26, 


10-223 


87, 237 






209-217, 223-229, 260-273, 302-315, 340-346, 353- 


C:107, D:12, E:49, 


231-251 








359, 377-383, 389-406, 420-426, 460-480 


F:88,G:118,H:6, 


264-297 




Spy2018 


M1-Protein




I:7, K:2, M:48, N:4


312-336 








5-28, 76-81, 180-195, 203-209, 211-219, 227-234, 


F:7,G:16,H:7, 


22-344 


88, 238 






242-252, 271-282, 317-325, 350-356, 358-364, 394- 


K:63,L:2,N:18, 






Spy2025


immunogenic 
secreted protein 
precursor 


400, 405-413, 417-424, 430-436, 443-449, 462-482, 
488-498, 503-509, 525-537 


O:42










5-28, 42-54, 77-83, 86-93, 98-104, 120-127, 145-159, 


I:15, K:3, N:12


1-151 


89, 239 






166-176, 181-187, 189-197, 213-218, 230-237, 263- 








Spy2039 


pyrogenic exotoxin B 


271, 285-291, 299-305, 326-346, 368-375, 390-395 












6-34, 48-55, 58-64, 84-101, 121-127, 143-149, 153- 


K:1


91-263 


90,240 






159, 163-170, 173-181, 216-225, 227-240, 248-254, 












275-290, 349-364, 375-410, 412-418, 432-438, 445- 












451, 465-475, 488-496, 505-515, bSS-bS^, 571-579, 












585-595, 604-613, 626-643, 652-659, 677-686, 688-








Spy2043 


mitogenic factor MF1
(speF) 


696, 702-709, 731-747, 777-795, 820-828, 836-842, 
845-856, 863-868, 874-882, 900-909, 926-943, 961- 
976, 980-986, 992-998, 1022-1034, 1044-1074, 1085-
1096, 1101-1112, 1117-1123, 1130-1147, 1181-1187, 
1204-1211, 1213-1223, 1226-1239, 1242-1249, 1265- 
1271, 1273-1293, 1300-1308, 1361-1367, 1378-1384, 
1395-1406, 1420-1428, 1439-1446, 1454-1460, 1477- 
1487, 1509-1520, 1526-1536, 1557-1574, 1585-1596, 
1605-1617, 1621-1627, 1631-1637, 1648-1654, 1675- 
1689, 1692-1698, 1700-1706, 1712-1719, 1743-1756 








Spy2059 


penicillin-binding 
protein 2a 


4-16, 75-90, 101-136, 138-144, 158-164, 171-177, 
191-201, 214-222, 231-241, 284-290, 297-305, 311- 
321, 330-339, 352-369, 378-385, 403-412, 414-422, 

428-435, 457-473, 503-521, 546-554, 562-568, 571- 
582, 589-594, 600-608, 626-635, 652-669, 687-702, 
706-712, 718-724, 748-760, 770-775 


D:2, E:2 


261-272 


91, 241 


Spy2110 


putative anaerobic 
ribonucleoside- 
triphosphate 
reductase 


4-19, 30-41, 46-57, 62-68, 75-92, 126-132, 149-156, 
158-168, 171-184, 187-194, 210-216, 218-238, 245- 
253, 306-312, 323-329, 340-351, 365-373, 384-391, 
399-405, 422-432, 454-465, 471-481, 502-519, 530- 

541, 550-562, 566-572, 576-582, 593-599, 620-634, 
637-643, 645-651, 657-664, 688-701 


E:7 


541-551 


92,242 


Spy2127 


Hypothetical protein


6-11, 17-25, 53-58, 80-86, 91-99, 101-113, 123-
131, 162-169, 181-188, 199-231, 245-252 


I:6, P:2


84-254 


93,243 


Spy2191 


hypothetical protein 


13-30, 71-120, 125-137, 139-145, 184-199 


C:20, E:3, M:5 


61-78 


94,244 


Spy2211 


transmembrane 
protein 


9-30, 38-53, 63-70, 74-97, 103-150, 158-175, 183-217, 

225-253, 260-268, 272-286, 290-341, 352-428, 434- 
4oU, 453-460, 469-478, 513-525, 527-534, 554-563, 
586-600, 602-610, 624-640, 656-684, 707-729, 735- 
749, 757-763, 766-772, 779-7SS, 799-805, 807-815, 
819-826, 831-855 


A:3 


568-580 


95,245 














ARF0450 


no homology 


11-21, 29-38 


A:11


5-17 


96, 246 


ARF0569 


no homology 


none


A:2


7-0 




ARF0694


no homology 


4-10, 16-28


R*7 T\*'^ A/r*'^ 


7-1 R 
/-lO 

26-34 


98, 248


ARF0700


No homology 


10-16 


M:11


1-15 


99, 249 


ARF1007


No homology 


none 


B:2 


4-11 


100, 250 


ARF1145


No homology 


4-40, 42-51 


C:9 


37-53 


101, 251 


ARF1208


no homology


4-21 


C:1


22-29 


102, 252 

ARF1262 


No homology 


none 


D:2 


2-11 


103, 253 


ARF1294 


Ol? /O vvllll ^r%\J LOX 

(first 28 aa of 67 aa
protein) 


0-17 yy-AA 

7-3./ 1 O^r^tfk 




1-99 




ARF1316


no homology 


19-25 27-'^2 


E:19 


15-34 




ARF1352 


38% with SA1142 (aa 
265-295 of 358 
protein) 


4-12, 15-22 


D:4 


11-33 


106, 256 


ARF1481 


No homology 


10-17 94-'^n '^Q-4fi 'SI -70 




51-61 


107, 257


ARF1557 


No homology 




C:2 


6-19 


108, 258


ARF1629 


36% with SP0069 (aa 
139-169 of 211 aa 
protein) 


6-11, 21-27, 31-54 


A:4, B:6 


11-29 


109, 259 


ARF1654 


no homology 


4-10, 13-45 


A:2 


11-35 


110, 260 


ARF2027 


no homology 


4-14, 23-32 


D:2 


11-35 


111, 261 


ARF2093


putative elongation 
factor TS 


14-39, 45-51


C:3 


15-29 


112, 262 


ARF2207 


38% with SP1006 (aa
7-37 of 67 aa protein) 


4-11, 14-28 


A:117 


4-17 


113, 263 


CRF0038 


No homology 


4-16 


C:6 


2-16 


114, 264 


CRF0122 


No homology 


4-10, 12-19, 39-50 


C:2 


6-22 


115, 265 


CRF0406 


no homology 


none 


D:5, E:11


2-13 


116, 266 


CRF0416 


No homology 


4-11, 22-65 


C:42 


3-19 


117, 267 


CRF0507


No homology 


17-23, 30-35, 39-46, 57-62 


B:3, C:4 


30-49 


118, 268 


CRF0549 


No homology 


4-19 


C:6 


14-22 


119, 269 


CRF0569 


No homology 


none 


N:35 


2-9 


120, 270 


CRF0628 


34% (14 of 41) with 
conserved 
hypothetical protein 
of P. aeruginosa 


7-18, 30-43


A:3 


4-12 


121, 271 


CRF0727


40% (16 of 40) with 
transcriptional 
regulator of S. 

pneumoniae (

SP0584) 


4-30, 39-47 


N:6 


5-22 


122, 272 


CRF0742


33% with SA0422 (aa 
11-37 of 42 aa protein, 
listed as 280 aa 
protein) 


6-15 


D:7, E:12 


14-29 


123, 273 


CRF0784 


No homology 


4-34 


N:9 


23-35 


124, 274 


CRF0854


No homology 


4-36, 44-57, 65-72 


N:14 


14-27 


125, 275 


CRF0875


no homology 


4-18 


A:4, D:1


11-20 


126, 276 


CRF0907 


Homology to 
lysosomal trafficking 
regulator LYST 
[Homo sapiens] 


none 


A:39 


5-19 


127, 277 


CRF0979 


no homology 


18-36 


D:21 


6-20 


128, 278 


CRF1068 


no homology 


4-10, 19-34, 41-84, 96-104


C:1, D:3


50-63 


129, 279 


CRF1152 


No homology 


4-9, 19-27 


C:15 


8-21 


130, 280 


CRF1203 


No homology 


4-16, 18-28 


N:3 


22-30 


131, 281 


CRF1225


No homology 


4-15 


C:8 


21-35 


132, 282 


CRF1236 


No homology 


4-17 


N:3 


3-13 


133, 283 


CRF1362


No homology 


4-12 


C:6 


4-18 


134, 284 


CRF1524 


no homology 


4-24, 31-36


D:3 


29-45 


135, 285


CRF1525


No homology 


12-22, 34-49


C:2 


21-32 


136, 286 


CRF1527 


no homology 


4-17 


D:4, E:1


22-32 


137, 287 


CRF1588 


No homology 


4-16, 25-42


C:2 


7-28 


138, 288 


CRF1649 


No homology 


4-10 


C:3 


7-20 


139, 289 


CRF1749 


No homology 


4-11, 16-36, 39-54 


C:15 


28-44 


140, 290 


CRF1903 


no homology 


5-20, 29-54 


A:14 


14-29 


141, 291 


CRF1964 


no homology 


24-33 


A:8 


10-22 


142, 292 


CRF2055 


no homology 


10-51, 54-61 


B:1, F:12, H:14


43-64 


143, 293 


CRF2091


No homology 


7-13 


C:2 


2-17 


144, 294 


CRF2096 


No homology 


11-20 


C:4 


6-20 


145,295 


CRF2104 


No homology 


4-^0, 34-41 


C:2 


19-28 


146, 296 


CRF2116 


No homology 


n.d.




11-21 


147, 297 


CRF2153 


no homology 


4-16, 21-26 


F:2 


9-38 


148, 298 


NRF0001


ARF in Oligo ABC 
transporter (not 
annotated by HGR), 
33% with SA0643 (aa

107-162 of 469 aa 
protein) 


4-12, 15-27, 30-42, 66-72 


A:7, B:1


10-24 


149, 299 


NRF0003


no homology 


8-17 


A:23 


11-20 


150, 300 
Table 2-1



Peptide 


Peptide sequence 


Seq. ID


Location in protein


SPA0450.1 


SRFLPTRBDlfSSLWSftSC 


246 


2-19 


SPA0569.1


SETOEKE^IE'BGS 


247 


1-12 


SPA0694.2


KTSOTIPTKBOKMBRIMr 


248 


21-38 


SPA1294.1 


MLKMOCIWSKLVJIjSQETKKF 


254 


2-22 


SPA1316.1 


GRTRHDHWXTfSPNGICSP 


255 


15-33 


SPA1352.1


QPKHKEX2FVLI<14LKMYESI<KQI 


256 


11-32 


SPA1629.1




259 


11-28 


SPA1654.1


MIQINTELSrLFEtlTLVQ 


260 


10-27 


SPA2027.1 




261 


9-26 


SPA2207.1




263 


4-16 


SPC0406.1




266 


1-18 


SPC0742.1


KfKSHKERLTINTEKRQG 


273 


12-29 


SPC0875.1 


WHYGLKLSQVQIl-iTFPPL 


276 


6-23 


SPC0907.1


aiElJ5HEI.EMNPHf3>mKRKLD 


277 


1-21 


SPC1068.1


EGHIQLSNASINtmBS 


279 


47-64 


SPC1524.1 


IXdRCSMISSQKVIKEVR 


285 


28-45 


SPC1527.1


TLKRFTEOTll'lETDDNFW 


287 


18-35 


SPC1903.1


KPLVKVPPMRTtmPPNPP 


291 


14-31 


SPC1964.1


HQIQQKI'JKKBRPKPTWSK 


292 


7-24 


SPN0001.1 


TQQLEBKeSLSUHLLKHL 


299 


8-25 


SPN0003.1 


SGRQDSMLSHUSPKPSTLPS 


300 


1-20 


SPO0012.1 


PVISTEKKLIP3KWAV 


151 


18-33 


SPO0012.2 


AYKDSDLTLPA 


151 


62-72 


SPO0019.1 




152 


118-131 


SPO0031.1


I^GmtETTRLPVKGQrSIDSPDEA 


154 


195-220 


SPO0031.2 


DSPDIfiYYESGC^VYWSWVTRGGYT 


154 


215-240 


SPO0031.4


PIKEPAQSWCSNOTTKPSIKWaarVT 


154 


255-280 


SPO0103.1 


KPSIiSQLK?iD 


155 


72-81 


SPO0112.1




156 


174-186 


spooiiai 


YIHSHOTLraUDDEV 


157 


317-331 


SPO0166.1 




158 


35-59 


SPO0166.2 




158 


54-84 


SPO0166.3




158 


79-104 


SPO0167.1


SNKQOTS^TETTTTNBQPKPESSEtiT 


159 


33-58 


SPO0167.2 


KE1>IPLESM3<EE1KKSE[XCKKS 


159 


81-101 


SPO0167.3 


PKEGVKKaDKFTVIE 


159 


136-150 


SPO0167.4 


TlfPAAIiQiUamsiT 


159 


173-186 


SPO0167.5 


NLVNQIHHEHYSGGNTIiPABIQ 


159 


231-251 


SPO0171.1


TPHHQrVKVSDVTLTETOMaWarVCre 


161 


22-48 


SPO0183.1


MVKTGKNKSEIEKKTG 


162 


24-39 


SPO0230.1 


MSLPKSQiJTifVSDDD 


163 


475-489 




CSASGETKnSNTHDDSLPK 


164 


38-56 


SPO0269.2


lI.KQ5DSIM;QVRQI14DrKlSSL 


164 


583-604 


SPO0287.1 


EVIAQAGSQIKFSMDKLGPSVTTY 


165 


202-223 


SPO0287.2 


PSVTTYI SBRGRIEKDTiNIEWRLAVM 


165 


222-247 


SPO0287.3


VIAXAVblNEC31VIADm5nLIGQGSQ2^ 


165 


242-267 


SPO0287.4




165 


262-287 


SPO0287.5


GEDTR\TMYGQKIVQIILOiGVIIiER 


165 


282-307 


SPO0287.6 


GVTT.ERGTLTMGIGanjagiMGgiDB. 


165 


302-327 


SPO0292.1


YSVUAKHaiAVDIiESGKyiaEiaaA 


166 


25-48 


SPC3029ai 


SQn£S£MmAVK 


167 


204-217 


SPOQ84ai 


BGGTAEETE(PSI£KILIP 


168 


259-276 


SPO0416.1


PVNTEfVHEX'7VICTKSASJCS<G 


169 


121-139 


SPO0416.2 


DGSHDIDH 


169 


260-267 


SPO0416.4


EDEI>EimE2IESEIff^EAEPK7a:KKHK]: 


169 


215-240 


SPO0430.1 


Q2HKRDPIiEfrElAElDDSQGGR|^lGRQ 


170 


115-140 




KPf^QULSEiniQPQnHMfEKGLEQ 


172 


182-204 


SPO0469.1


DVPTTPETISA 


173 


144-153 




KTDISEiAPISANBPV 


173 


205-219 


SPO0488.1


TNPKGRQOTIT 


174 


196-206 


SPO0488.2 


TPGIPGKFKR 


174 


240-249 


SPO0488.3


NI KVIDEKSTGRFEPF 


174 


272-287 


CDTW^OO A 

orULMott4 


KGR^\TXTYGDGSTDZXPPAVLl'}KK 






SPO0488.5


AVLtlKKGSVKBPTEADQSVG 


174 


218-237 


spoo5iai 


GOVYBGRMTGATSU'tl'PSREBIEXS 


175 


226-249 


SPOC615.2 


GnESVLKKVESGftSNKVERG 


175 


287-306 


SPOO580.1 


lABKLQDEIAELVKIDPKSI 


176 


430-449 


spooe2i.i 


LEWPRIQIEMMQKDG 


177 


361-375 


SPa3630.1 


MKPfgsnaAxrvQGGaQDcim: 


178 


241-260 


SPO0702.1 


flLVRKDGTHTAEVHESNOTP 


181 


483-502 


spoo7iai 


NiTLGIGQO'DKDGNRISIX'J 


182 


379-396 


SPO0720.1 


LGSQAGLKEaiagMFPDKKVL 


184 


31-51 


SPO0737.1 


DIAERaAKri^RLIimKrLTDDQBK 


186 


1436-1460 


SPO0737.2 


TiaXiRKE3Q!LLGVEiTEYM<GI 


186 


1455-1474 


SPO0737.3




186 


1469-1487 


SPO0747.1


VBDDSGKSIWHIIH 


187 


215-229 


SPO0747.2


MLKPEKINLQKDLSKLSIASmCEMESA 


187 


534-561 



Sera P1-P10 and N1 onward (graphic not recoverable)


+


++


+++


score






13 


13 






7 


7 






15 


11 


2 




7 


7 






17 


15 


1 




2 
35 


2 
10 


11 


1 


10 


7 




1 


18 


11 


2 


1 


35 


10 


11 


1 


16 


11 


1 


1 


31 


10 


4 


3 


5 


5 






15 


7 


1 


2 


19 


7 


3 


2 


23 


13 


2 


2 


13 


9 


2 




23 


10 


5 


1 


32 


11 


9 


1 


29 


12 


7 


1 


5 


5 






24 


10 


7 




31 


12 


8 


1 


17 


9 


4 




4 


4 






7 


7 






1 


1 






19 


17 


1 




9 


9 






24 


16 


4 




16 


14 


1 




10 


10 






16 


10 


1 




21 


18 




1 


17 


11 


3 




18 


16 


1 




3 


3 






36 


11 


6 


3 


5 


3 


1 




8 


6 






34 


11 


10 


1 


13 


11 


1 




40 


8 


6 


4 


2 




2 




18 


14 


2 




7 


7 






17 


15 


1 




1 


1 






16 


14 


1 




12 


8 


2 




2 


2 






26 


11 


1 


3 


33 


12 


9 


1 


31 


12 


8 


1 


24 


9 


2 


3 


12 


10 


1 




11 


9 


1 




19 


14 


1 


1 


25 


13 


6 




2 


2 






19 


13 


3 




12 


8 


2 




25 


15 


2 


2 


14 


12 


1 




6 


6 






21 


17 


2 




17 


7 


5 




14 


12 


1 




16 


14 


1 




28 


10 


6 


2 


20 


12 


4 




33 


14 


5 


3 


11 


7 


2 




21 


14 


2 


1 


12 
29 


10 
11 


1 
9 




14 


12 


1 
SPO0872.1




192 


90-105 


SPO0872.2




192 


112-136 


SPO0895.1




193 




SPO0972.1


KCaSRGSKKSKTIAIWFIV 


194 


33-50 


SPO0981.1


PPT QPT .KHTIKTRNFTT . 


195 


76-90 


SPO1008.1




196 


70-88 


SPO1032.1




197 


418-442


SPO1032.2




197 


574-585


SPO1054.1 




198 


87-104








124-148 


SPO1054.3 




198 


'''41~248 


SPO1063.1 


IKHAES^MT 


199 




SPO1162.1




200 


183-198


SPO1206.1 


GLVQ^TiSiQCSTElyB IVT 






SPO1228.1




202 


202-217














203 


69-93 


SPO1245.3








SPO1245.4




203 


107-127


SPO1357.1


KF32?iIEDE<EAlITRIEAfiSS 


205 


74-92


SPO1361.1


PGUIFPrSI33FQn'KK33IVQ/TKDSI 


206 


207-232




VIKDbXIjVIjnDQnLnlrlbCiU^ 




227—25*' 


SHJ12o1.3 




^07 


^47~^^ 


SPO1371.1








SPO1371.2


ADCyAAKI K 


207 


297-305


op\jlo/1.o 


SVQlEEICDKDlTPEtEDrSAADEVEGL 




312-337




T VTRTATi^iv\etiVYrn'Dr\T ctt 




667-684


SPO1390.1


rS®lM<IKEKAEBNIIA 


210 


279-295 




DIEYADEVTLLBAIENKrEL 


211 


■"■^^^^^ 


SPO1494.1


MilSATGDKEIXCSTLVEKEWREKD 








ruraPKi \w\ jTTufraiijgccni'i'iinnrpn 


213 


46-70




TinftriRrivitii jyi Bin'R^yn'ri'i *<m 


213 


8?"w^ 


SPO1494.4


^^^^^^^^^^^^^^^^ 


213 






uiuiiji-Xjia^jjaWIMPs' « .'.^.unr.iDffi^EIR 


213 


112-141


SPO1536.1




215 


248-260 


OI"W4EO>I -1 

bKJ1oo4.1 




216 




SPO1607.1


ACXCVSQKLEDKifCSKLP 


218 


'^7~73° 


oHJ1b1a,1 




219 




SPO1666.1


EEWKEReJSiijVSWitJLljES 


220 


297-314


SPO1727.1


BI^SLEEEKEEWVIIVH 


221 


142-157 




ITT CKfTDEfBTIunWiilt 


222 


9^0— <m / 


SPO1785.2




222 


573-593 


SPO1798.1


EKWIM^LQCnUDGDGKLSKTEL 


223 


523-544 


SPO1798.2


RIIjVEEIGPYASQSASCEYYKHIEK 


223 


46-70 


SPO1798.3


VKHIEEQ! lVEN[WffiKSLE(3KrFD 


223 


65-89 


SPO1798.4


GEE^TEDIIslYQGXKQC^DLiIKDGKHE 


223 


84-108 




TtiJ^MjSiJWTFISfiOQiaiTtiKJDSIKEQT/ 


223 


122-151 




KFE!DQEFkPSPSE(\n:n!ASASS 


224 


123-142 


SPO1813.1


EfnCTRY5SPWH*irflfrT f? 


225 


903-9^1 


SPO1821.1


EVIGWVETOVELIWSEr 


226 


119-136


SPO1916.1




227 


142-161


SPO1972.1


AWRTCTT.TT .riKSKTffTftTTCV' 


228 


258-277 


SPO1972.2


GDftIBCVQPKDYLFKEIJCNinX?/EVKrJrDP 


228 


272-300


SPO1972.3




S9 


295-322


SPO1979.1


JJlreUKaHLKLiC 1 IKxyUvwiMI'il il innt*H2l <1 liJta 






SPO1979.2


^(K]DNl:'JUViJJi(EKINN'l'LlL3/IS!iEQn(VIK 


229 


278-304


SPO1983.1




230 


131-lBO 










SPO1991.1




231 


53-70


SPO2000.1


NPIGSGE'iMVKEIYE'vAQiCS^EVFNP 


232 


184-208




WVIiLE0^7rAIAfiIESGI3Vri^YATP 


232 


222-246


oHJdXAJ.4 




232 


241-265




riTDCMTRTDrT CT VN\JimcS.1 1 'I'llgDri 


232 


260-284


orUdJUUlD 






279-303


orUdJLUO 




232 


317-341




VXAKETEVHE(DGSESI£SAk 


233 


678-696










5HJtaJl0.2 


PTASGrKISP.FSSV^?3jTA. 




A eZA^A 01 
^O'l— 4bX 


orAJdJlD.1 




236 




SPQG016.2 


DPYTPPYGGRLCTGYEKBD 


236 


137-155, 166-184 


SPO0O16.3 


PCS^MaXGLSEGPS 


236 


215-228 


SPQ30iai 


TEVKANOXI-JEBEVI 


237 


37-51 


sposoia2 




237 


53-75 


SPQ0O18.3 


EE£N2ISD!%SBQ5XPBIXjDA, 


237 


232-251 


SPCfa318.4 


EKQISnftSRQGUlRDIIftS 


237 


318-336 


SPOZO^I 


AElJENPMbHKK 


238 


305-315 


sposces.a 


Q^lKD^F^PSPDCXX<IX^^PDiC^H2'v3 


238 


131-156 


SPOE059.1 


QJ3vt\GVEi:ftsaQrFGrT 


241 


258-275 


SPQ2127.1 




243 


107-137 




9 
15 
7 

6 
12 
15 
11 
7 
9 
13 
3 
5 
14 
14 
14 





5 

] 








"i 










5 


3 


1 






13 

4 
6 


9 
4 
6 


2 






9 


7 


1 






5 


5 








31 


11 


4 


4 




2 


2 








25 


13 


3 


2 




25 


15 


2 


2 




17 


15 


1 






8 


8 








7 


7 








19 


10 


3 


1 




14 


10 


2 






27 


12 


3 


3 




19 


13 


3 






17 


15 


1 






28 


15 


5 


1 




15 


11 


2 






37 


11 


5 


4 


1 


26 


14 


3 


2 




20 


14 


3 






12 


6 




2 




21 


11 


5 






17 


12 


1 


1 




17 


13 


2 






18 


12 


3 






12 


10 


1 






2 


2 








20 


16 


2 






23 


11 


1 


2 


1 


12 


10 


1 






34 


10 


10 




1 


27 


12 


6 


1 




18 


16 


1 






15 


12 




1 




15 


9 


3 






42 


5 


14 


3 




24 


16 


4 






22 


20 


1 






20 


16 


2 






11 


11 








23 


17 


3 






12 


12 








13 


11 


1 






26 


14 


6 






23 


9 


7 






30 


11 


3 


3 


1 


45 


6 


7 


7 


1 


51 


6 


6 


7 


3 


22 


8 


4 


2 




43 


5 


12 


3 


2 


18 


14 


2 






21 


13 


4 






^ 


8 


8 


6 




17 


11 


3 






27 


16 


4 


1 




8 


5 




1 




10 


10 








12 


10 


1 
| ORF | Common name | Gene distribution (present of 50) | Amino acid substitutions (in strain M89) | Homology (SP/EC) | Seq. ID (DNA, Prot.) |
|---|---|---|---|---|---|
| Spy0012 | Hypothetical protein | 50 | 3/302 | SP0010 - 40%/None | 1, 151 |
| | putative secreted protein (cell division and antibiotic tolerance) | 50 | 0/300 | SP2216 - 44-49%/None | 2, 152 |
| Spy0025 | putative phosphoribosylformylglycinamidine synthase II | 38 | 0/303 | SP0045 - 85%/24% | 3, 153 |
| Spy0031 | putative choline binding protein | 50 | 0/297 | SP2201 - 42% (cbpD)/None | 4, 154 |
| Spy0103 | putative competence protein | 50 | 0/81 | SP2051 - 41%/None | 5, 155 |
| Spy0112 | putative pyrroline carboxylate reductase | 50 | 3/235 | SP0933 - 32%/34% | 6, 156 |
| Spy0115 | putative glutamyl-aminopeptidase | 50 | 6/306 | SP1865 - 76%/30% | 7, 157 |
| Spy0166 | hypothetical protein | 50 | n.d. | None/None | 8, 158 |
| Spy0167 | Streptolysin O | 50 | 7/300 | SP1923 - 40% (Pneumolysin)/None | 9, 159 |
| Spy0168 | hypothetical protein | 8 | 19/126 | None/None | 10, 160 |
| Spy0171 | hypothetical protein | 18 | 8/95 | None/None | 11, 161 |
| Spy0183 | putative glycine betaine/proline ABC transporter | 50 | 0/297 | SP0151 - 39%/48% | 12, 162 |
| Spy0230 | putative ABC transporter (ATP-binding protein) | 50 | 1/299 | SP2073 - 64%/32% | 13, 163 |
| Spy0269 | putative surface exclusion protein | 50 | 1/303 | None/None | 14, 164 |
| Spy0287 | conserved hypothetical protein | 50 | 1/307 | SP0868 - 71%/19% | 15, 165 |
| Spy0292 | penicillin-binding protein (D-alanyl-D-alanine car | 50 | 1/359 | SP0872 - 47%/27% | 16, 166 |
| Spy0295 | oligopeptide permease | 50 | 2/269 | SP1889 - 69%/24% | 17, 167 |
| Spy0348 | putative aminodeoxychorismate lyase | 50 | 1/307 | SP1518 - 47%/25% | 18, 168 |
| Spy0416 | putative cell envelope serine proteinase | 50 | 4/314 | SP0641 - 22%/None | 19, 169 |
| Spy0430 | hypothetical protein | 13 | 0/165# | None/None | 20, 170 |
| Spy0433 | hypothetical protein | 21 (27/49)¹ | 2/174# | None/None | 21, 171 |
| Spy0437 | Hypothetical protein | 19 (34/49)¹ | 0/106# | None/None | 22, 172 |
| Spy0469 | putative 42 kDa protein | 50 | 6/313 | SP2063 - 44% (LysM protein)/None | 23, 173 |
| Spy0488 | hypothetical protein | 50 | 9/178 | None/None | 24, 174 |
| Spy0515 | Putative sugar transferase | 50 | n.d. | SP1075 - 26%/None | 25, 175 |
| Spy0580 | conserved hypothetical protein | 50 | 0/297 | SP0908 - 72%/43% | 26, 176 |
| Spy0621 | conserved hypothetical protein | 50 | n.d. | SP1290 - 72%/None | 27, 177 |
| Spy0630 | putative PTS dependent N-acetyl-galactosamine-IIC | 50 | n.d. | SP0324 - 79%/30% | 28, 178 |
| Spy0681 | Hypothetical protein, phage associated | 27 | 2/303# | None/None | 29, 179 |
| Spy0683 | putative minor capsid protein, phage associated | 25 | 1/233 | None/None | 30, 180 |
| Spy0702 | Hypothetical protein | 22 | n.d. | None/None | 31, 181 |
| Spy0710 | conserved hypothetical protein, phage associated | 32 | 51/286# | None/36% in 122 of 313aa | 32, 182 |
| Spy0711 | pyrogenic exotoxin C precursor, phage associated (speC) | 17 | 1/225 | None/None | 33, 183 |
| Spy0720 | conserved hypothetical protein | | [illegible] | [illegible] (... protein)/None | 34, 184 |
| Spy0727 | Putative DNA gyrase, subunit B | n.d. | n.d. | [illegible] | 35, 185 |
| Spy0737 | putative extracellular matrix binding protein | | [illegible]# | None/27% in [illegible] | 36, 186 |
| Spy0747 | extracellular nuclease | 50 | 0/179 | None/None | 37, 187 |
| Spy0777 | putative ATP-dependent exonuclease, subunit A | 50 | 2/306 | SP1152 - 48%/22% | 38, 188 |
| Spy0789 | putative ABC-transporter (permease protein) | 50 | 1/231 | None/None | 39, 189 |
| Spy0839 | putative glycerophosphodiester phosphodieste | 50 | 1/301 | SP0994 - 24%/31% in 121 of 358aa | 40, 190 |
| Spy0843 | cell surface protein | 50 | 3/312 | None/None | 41, 191 |
| Spy0872 | putative secreted 5'-nucleotidase | 50 | | None/27% in 274 of [illegible] aa | 42, 192 |
| [illegible] | histidine protein kinase | 50 | | None/None | 43, 193 |
| Spy0972 | putative terminase, large subunit - phage | 28 | 1/314# | None/None | 44, 194 |
| Spy0981 | hypothetical protein - phage associated | 23 | n.d. | None/None | 45, 195 |
| Spy1008 | streptococcal exotoxin H precursor (speH) | 15 (14/49)¹ | 1/223# | None/None | 46, 196 |
| Spy1032 | extracellular hyaluronate lyase | 50 (175 of 175, Hynes 2000) | 3/311 | SP0314 - 51%/None | 47, 197 |
| Spy1054 | putative collagen-like protein (SclC) | 26, (45/49)¹ (50 of 50, but varying number of repeats; Lukomski, 2001) | n.d. | None/None | 48, 198 |
| Spy1063 | putative periplasmic-iron-binding protein | 49/50 (49/49)¹ | 2/292# | SP0243 - 52%, iron ABC transporter/26% in 161 of 348aa | 49, 199 |
| Spy1162 | putative ribonuclease HII | 50 | 3/240 | SP1156 - 67%/46% | 50, 200 |
| Spy1206 | putative ABC transporter | 50 | 1/302 | SP0770 - 81%/30% | 51, 201 |
| Spy1228 | Putative lipoprotein | 49 | n.d. | SP0845 - 57%/None | 52, 202 |
| Spy1245 | Putative ABC transporter | 50 | n.d. | SP1400 - 64%/None | 53, 203 |
| Spy1315 | hypothetical protein | 50 | 4/305 | SP1241 - 64%/32% | 54, 204 |
| Spy1357 | protein GRAB (protein G-related alpha 2M-binding protein) | 49; 11 of 12 strains (Rasmussen, 1999) | 9/226; insertion of 28 aa | None/None | 55, 205 |
| Spy1361 | putative internalin A precursor | 50 | 7/295 | SP1004 - 26% in 283 of 1039/None | 56, 206 |
| Spy1371 | putative NADP-dependent glyceraldehyde-3-phosphate dehydrogenase | 50 | 2/308 | SP1119 - 71%/34% | 57, 207 |
| Spy1375 | putative ribonucleotide reductase alpha-c | 50 | 4/304 | SP1179 - 85%/49% | 58, 208 |



WO 2004/078907 PCT/EP2004/002087 



| Spy1389 | putative alanyl-tRNA synthetase | 50 | 0/309 | SP1383 - 74%/40% | 59, 209 |
| Spy1390 | putative protease maturation protein | 50 | 0/232 | SP0981 - 42%/None | 60, 210 |
| Spy1422 | putative recombination protein | n.d. | n.d. | SP1672 - 88%/64% | 61, 211 |
| Spy1436 | putative deoxyribonuclease | 25 | 0/243# | SP1964 - 29% in 181 of [illegible]74aa/None | 62, 212 |
| Spy1494 | hypothetical protein | 50 | 13/282 | None/None | 63, 213 |
| Spy1523 | cell division protein | 49 | 2/329 | SP0690 - 27%/None | 64, 214 |
| Spy1536 | conserved hypothetical protein | 50 | 9/280 | SP1967 - 57%/None | 65, 215 |
| Spy1564 | conserved hypothetical protein | 39 | n.d. | None/None | 66, 216 |
| Spy1604 | conserved hypothetical protein | 50 | 1/233 | SP2143 - 47%/28% | 67, 217 |
| Spy1607 | conserved hypothetical protein | 50 | 0/241 | SP1902 - 55%/None | 68, 218 |
| Spy1615 | putative late competence protein | 50 | 2/204 | SP2207 - 41%/None | 69, 219 |
| Spy1666 | conserved hypothetical protein | 50 | 2/305 | SP0334 (yllC) - 78%/40% | 70, 220 |
| Spy1727 | conserved hypothetical protein | 50 | 0/237 | SP0549 - 53%/None | 71, 221 |
| Spy1785 | putative ATP-dependent DNA helicase | 50 | 1/306 | SP1697 - 71%/37% | 72, 222 |
| Spy1798 | hypothetical protein | 50 | 2/128 | None/None | 73, 223 |
| Spy1801 | immunogenic secreted protein precursor homolog | 50 | 6/313; insertion of 6 aa | SP2216 - 33% in 119 of 392aa/None | 74, 224 |
| Spy1813 | hypothetical protein | 46 | 47/433; insertion of 9, deletion of 1 aa | None/None | 75, 225 |
| Spy1821 | putative translation elongation factor EF-P | n.d. | n.d. | SP0435 - 94%/45% | 76, 226 |
| Spy1916 | putative phospho-beta-D-galactosidase | n.d. | n.d. | SP1184 - 91%/83% | 77, 227 |
| Spy1972 | Pullulanase | 50 | 1/233 | SP0268 - 53%, SP1118 - 29%/25% in 352 of 657aa | 78, 228 |
| Spy1979 | Streptokinase A precursor | 50 | 20.1% identical of 309# | None/None | 79, 229 |
| Spy1983 | collagen-like surface protein (SclD) | 50, (50 of 50, but size variation according to Lukomski, 2000) | n.d. | None/None | 80, 230 |
| Spy1991 | anthranilate synthase component II | 50 | 1/170 | SP1816 - 58%/47% | 81, 231 |
| Spy2000 | surface lipoprotein | 50 | 0/307 | None/27% in 389 of 524aa | 82, 232 |
| Spy2006 | hypothetical protein | 50 | 0/234 | SP1003 - 36%, SP1174 - 37%, SP1004 - 33%, SP1175 - 48%/None | 83, 233 |
| Spy2009 | hypothetical protein | 39 (38/49)¹ | 58/344; insertion of 36, deletion of 4 aa | None/None | 84, 234 |
| Spy2010 | C5A peptidase precursor | n.d. | n.d. | SP0641 - 23% in 783 of 2140aa/None | 85, 235 |
| Spy2016 | inhibitor of complement (Sic) | 47; mainly in M1 strains (Reid 2001) | 11/269# | None/None | 86, 236 |
| Spy2018 | M1-Protein | n.d. | n.d. | None/None | 87, 237 |
| Spy2025 | immunogenic secreted protein precursor | 50 | 3/296 | SP2216 - 31% in 138 of 392aa/None | 88, 238 |
| Spy2039 | pyrogenic exotoxin B | n.d. | n.d. | None/None | 89, 239 |
| Spy2043 | mitogenic factor MF1 (speF) | 50 | 0/247 | None/None | 90, 240 |
| Spy2059 | penicillin-binding protein 2a | 50 | 0/293 | SP2010 - 55% (pbp2A)/30% in 539 of 844aa | 91, 241 |
| | ribonucleoside-triphosphate reductase | 50 | | SP0202 - 80% [illegible] | 92, 242 |
| Spy2127 | Hypothetical protein | 1 | n.d. | None/None | 93, 243 |
| Spy2191 | hypothetical protein | 50 | 1/175 | None/None | 94, 244 |
| Spy2211 | transmembrane protein | 50 | 2/281 | SP2231 - 43%/None | 95, 245 |
| ARF0450 | hypothetical protein | 50 | 5/191 | None/None | 96, 246 |
| ARF0569 | hypothetical protein | n.d. | n.d. | None/None | 97, 247 |
| ARF0694 | hypothetical protein | 23 | 1/122# | None/None | 98, 248 |
| ARF0700 | hypothetical protein | n.d. | n.d. | None/None | 99, 249 |
| ARF1007 | hypothetical protein | n.d. | n.d. | None/None | 100, 250 |
| ARF1145 | hypothetical protein | n.d. | n.d. | None/None | 101, 251 |
| ARF1208 | hypothetical protein | n.d. | n.d. | None/None | 102, 252 |
| ARF1262 | hypothetical protein | n.d. | n.d. | None/None | 103, 253 |
| ARF1294 | hypothetical protein | 50 | 1/186 | 39% with SA0131 (first 28 aa of 67 aa protein) | 104, 254 |
| ARF1316 | hypothetical protein | n.d. | n.d. | None/None | 105, 255 |
| ARF1352 | hypothetical protein | n.d. | n.d. | 38% with SA1142 (aa 265-295 | 106, 256 |
| ARF1481 | hypothetical protein | n.d. | n.d. | None/None | 107, 257 |
| ARF1557 | hypothetical protein | n.d. | n.d. | None/None | 108, 258 |
| ARF1629 | hypothetical protein | n.d. | n.d. | 36% with SP0069 (aa 139-169 of [illegible] aa protein) | 109, 259 |
| ARF1654 | hypothetical protein | n.d. | n.d. | None/None | 110, 260 |
| ARF2027 | hypothetical protein | n.d. | n.d. | None/None | 111, 261 |
| ARF2093 | hypothetical protein | n.d. | n.d. | None/None | 112, 262 |
| ARF2207 | hypothetical protein | 50 | n.d. | 38% with SP1006 (aa 7-37 of 67 aa protein) | 113, 263 |
| CRF0038 | hypothetical protein | n.d. | n.d. | None/None | 114, 264 |
| CRF0122 | hypothetical protein | n.d. | n.d. | None/None | 115, 265 |
| CRF0406 | hypothetical protein | n.d. | n.d. | None/None | 116, 266 |
| CRF0416 | hypothetical protein | n.d. | n.d. | None/None | 117, 267 |
| CRF0507 | hypothetical protein | n.d. | n.d. | None/None | 118, 268 |
| CRF0549 | hypothetical protein | n.d. | n.d. | None/None | 119, 269 |
| CRF0569 | hypothetical protein | n.d. | n.d. | None/None | 120, 270 |
| CRF0628 | hypothetical protein | n.d. | n.d. | None/None | 121, 271 |
| CRF0727 | hypothetical protein | n.d. | n.d. | 40% with SP0584 (aa 21-60 of 70 aa protein) | 122, 272 |
| CRF0742 | hypothetical protein | n.d. | n.d. | 33% with SA0422 (aa 11-37 of 42 aa protein, listed as 280 aa protein) | 123, 273 |
| CRF0784 | hypothetical protein | n.d. | n.d. | None/None | 124, 274 |
| CRF0854 | hypothetical protein | n.d. | n.d. | None/None | 125, 275 |
| CRF0875 | hypothetical protein | n.d. | n.d. | None/None | 126, 276 |
| CRF0907 | hypothetical protein | n.d. | n.d. | Homology to lysosomal trafficking regulator LYST [Homo sapiens] | 127, 277 |
| CRF0979 | hypothetical protein | n.d. | n.d. | None/None | 128, 278 |
| CRF1068 | hypothetical protein | 50 | 0/148 | None/None | 129, 279 |
| CRF1152 | hypothetical protein | n.d. | n.d. | None/None | 130, 280 |
| CRF1203 | hypothetical protein | n.d. | n.d. | None/None | 131, 281 |
| CRF1225 | hypothetical protein | n.d. | n.d. | None/None | 132, 282 |
| CRF1236 | hypothetical protein | n.d. | n.d. | None/None | 133, 283 |
| CRF1362 | hypothetical protein | n.d. | n.d. | None/None | 134, 284 |
| CRF1524 | hypothetical protein | n.d. | n.d. | None/None | 135, 285 |
| CRF1525 | hypothetical protein | n.d. | n.d. | None/None | 136, 286 |
| CRF1527 | hypothetical protein | n.d. | n.d. | None/None | 137, 287 |
| CRF1588 | hypothetical protein | n.d. | n.d. | None/None | 138, 288 |
| CRF1649 | hypothetical protein | n.d. | n.d. | None/None | 139, 289 |
| CRF1749 | hypothetical protein | n.d. | n.d. | None/None | 140, 290 |
| CRF1903 | hypothetical protein | 50 | 0/140 | None/None | 141, 291 |
| CRF1964 | hypothetical protein | n.d. | n.d. | None/None | 142, 292 |
| CRF2055 | hypothetical protein | n.d. | n.d. | None/None | 143, 293 |
| CRF2091 | hypothetical protein | n.d. | n.d. | None/None | 144, 294 |
| CRF2096 | hypothetical protein | n.d. | n.d. | None/None | 145, 295 |
| CRF2104 | hypothetical protein | n.d. | n.d. | None/None | 146, 296 |
| CRF2116 | hypothetical protein | n.d. | n.d. | None/None | 147, 297 |
| CRF2153 | hypothetical protein | n.d. | n.d. | None/None | 148, 298 |
| NRF0001 | hypothetical protein | 50 | 0/130 | ARF in Oligo ABC transporter (not annotated by TIGR), 33% with SA0643 (aa 107-162 of 469 aa protein) | 149, 299 |
| NRF0003 | hypothetical protein | n.d. | n.d. | None/None | 150, 300 |

Table 4: Recombinant proteins used for immunisation experiments in NMRI mice.

| ORF | Length (amino acids) | Amino acids, from | Amino acids, to | Solubility | Protection | Total size of the fragment cloned (Kbp) |
|---|---|---|---|---|---|---|
| | | 39 | 374 | [illegible] | 20% ([illegible], 40%) | 1.008 |
| | | 2 | 108 | | 50% ([illegible]) | 0.321 |
| Spy0269 | 873 | 36 | 873 | Soluble | 40% (40%, 70%)c | 2.511 |
| Spy0292 | 410 | 22 | 410 | Insoluble | [illegible] | |
| Spy0416A | | 33 | 867 | Soluble | [illegible] ([illegible], 40%) | 2.502 |
| Spy0416B | 1647 | 736 | 1617 | Solubilized | 0% (0%, 40%) | 2.646 |
| Spy0720 | 313 | 2 | 313 | Insoluble | 60% (10%, 80%) | 0.939 |
| Spy0872 | 670 | 27 | 640 | Solubilized | 60% (10%, 80%) | 1.839 |
| Spy1245 | 288 | 49 | 288 | Soluble | 20% (10%, 40%) | 0.717 |
| Spy1357 | 217 | 33 | 186 | Soluble | 40% (30%, 90%) | 0.459 |
| [illegible] | | 22 | 792 | Soluble | [illegible] | |
| Spy1390 | 351 | 21 | 351 | | [illegible] | |
| | | 31 | 345 | | 20% ([illegible], 40%) | 0.942 |
| Spy1607 | 258 | 2 | 258 | | 40% (10%, 40%) | 0.771 |
| Spy1666 | 337 | [illegible] | 337 | Soluble | 50% (30%, 90%) | 0.945 |
| Spy1972 | 1165 | 45 | 500 | | 40% (30%, 90%) | 1.365 |
| Spy2000 | 542 | 24 | 542 | Soluble | 20% (30%, 90%) | 1.554 |
| Spy2025 | 541 | 27 | 541 | | 40% (40%, 70%) | 1.542 |
| Spy2191 | 204 | 36 | 204 | | 50% (10%, 80%) | 0.504 |
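
Table 5: Variability of antigens in strains of S. pyogenes.

| Antigen | Seq ID | Residue in [illegible] | Residue number | Amino acid variations |
|---|---|---|---|---|
| Spy0031 | | [illegible] | [illegible] | [illegible] |
| | | A | [illegible] | [illegible] |
| | | V | [illegible] | I |
| | | [illegible] | [illegible] | N |
| | | I | [illegible] | [illegible] |
| Spy0103 | 155 | none | | |
| Spy0269 | 164 | H | 97 | N |
| | | A | 150 | V |
| | | A | 168 | V |
| | | H | 482 | R |
| | | N | 485 | K |
| | | Q | 577 | E |
| | | A | 610 | V |
| | | L | 636 | M |
| | | E | 640 | K |
| | | P | 752 | S |
| | | I | [illegible] | V |
| | | [illegible] | [illegible] | [illegible] |
| | | [illegible] | [illegible] | K |
| Spy0292 | 166 | A | 214 | D |
| | | Y | 309 | S |
| | | T | 317 | N |
| | | V | 318 | C |
| | | K | 319 | Q |
| Spy0416 | 169 | V | 1 | M |
| | | F | 25 | M |
| | | L | 26 | M |
| | | V | 27 | M |
| | | S | 38 | T |
| | | M | 40 | T |
| | | A | [illegible] | T |
| | | [illegible] | [illegible] | P |
| | | [illegible] | [illegible] | P |
| | | S | 85 | P |
| | | D | 87 | G |
| | | S | 104 | P |
| | | S | 110 | P |
| | | [illegible] | [illegible] | [illegible] |
| | | [illegible] | [illegible] | P |
| | | E | 215 | G |
| | | H | 279 | A, S, T, G |
| | | T | 395 | I |
| | | D | 452 | N |
| | | N | 478 | K |
| | | G | 484 | D |
| | | A | 547 | V |
| | | S | 617 | A, S, T, G |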
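| | | D | 723 | A |
| | | H | 749 | R |
| | | R | 770 | K |
| | | P | 787 | S |
| | | D | 804 | A |
| | | T | 874 | M |
| | | N | 913 | S |
| | | H | 991 | Y |
| | | N | 1080 | S |
| | | V | 1238 | A |
| | | D | 1313 | G |
| | | V | 1349 | M |
| | | A | 1393 | V |
| | | N | 1479 | K |
| | | I | 1487 | M |
| | | D | 1516 | G |
| | | N | 1555 | D |
| | | T | 1560 | A |
| | | S | 1599 | F |
| | | S | 1605 | T |
| | | T | 1617 | A |
| Spy0720 | 184 | A | 61 | T |
| | | I | 63 | M |
| | | K | 99 | Q |
| | | K | 109 | Q |
| | | N | 295 | S |
| Spy0872 | 192 | K | 178 | N |
| | | P | 181 | S |
| | | V | 253 | I |
| | | A | 393 | V |
| | | T | 600 | I |
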

Claims: 

1. An isolated nucleic acid molecule encoding a hyperimmune serum reactive antigen or a fragment 
thereof comprising a nucleic acid sequence which is selected from the group consisting of: 

a) a nucleic acid molecule having at least 70% sequence identity to a nucleic acid molecule selected 
from Seq ID No 1, 4-8, 10-18, 20, 22, 24-32, 34-35, 38-40, 43-46, 49-51, 53-54, 57-61, 63, 65-71, 73, 
75-77, 81-82, 88, 91-94 and 96-150, 

b) a nucleic acid molecule which is complementary to the nucleic acid molecule of a), 

c) a nucleic acid molecule comprising at least 15 sequential bases of the nucleic acid molecule of a) 
or b), 

d) a nucleic acid molecule which anneals under stringent hybridisation conditions to the nucleic 
acid molecule of a), b) or c), 

e) a nucleic acid molecule which, but for the degeneracy of the genetic code, would hybridise to the 
nucleic acid molecule defined in a), b), c) or d). 

2. The isolated nucleic acid molecule according to claim 1, wherein the sequence identity is at least 
80%, preferably at least 95%, especially 100%. 

3. An isolated nucleic acid molecule encoding a hyperimmune serum reactive antigen or a fragment 
thereof comprising a nucleic acid sequence selected from the group consisting of 

a) a nucleic acid molecule having at least 96% sequence identity to a nucleic acid molecule selected 
from Seq ID No 64, 

b) a nucleic acid molecule which is complementary to the nucleic acid molecule of a), 

c) a nucleic acid molecule comprising at least 15 sequential bases of the nucleic acid molecule of a) 
or b), 

d) a nucleic acid molecule which anneals under stringent hybridisation conditions to the nucleic 
acid molecule of a), b) or c), 

e) a nucleic acid molecule which, but for the degeneracy of the genetic code, would hybridise to the 
nucleic acid defined in a), b), c) or d). 

4. An isolated nucleic acid molecule comprising a nucleic acid sequence selected from the group 
consisting of 

a) a nucleic acid molecule selected from Seq ID No 3, 36, 47-48, 55, 62, 72, 80, 84, 95, 

b) a nucleic acid molecule which is complementary to the nucleic acid of a), 

c) a nucleic acid molecule which, but for the degeneracy of the genetic code, would hybridise to the 
nucleic acid defined in a), b), c) or d). 

5. The nucleic acid molecule according to any one of the claims 1, 2, 3 or 4, wherein the nucleic acid is 
DNA. 

6. The nucleic acid molecule according to any one of the claims 1, 2, 3, 4 or 5, wherein the nucleic acid 
is RNA. 

7. An isolated nucleic acid molecule according to any one of claims 1 to 5, wherein the nucleic acid 
molecule is isolated from a genomic DNA, especially from a S. pyogenes genomic DNA. 

8. A vector comprising a nucleic acid molecule according to any one of claims 1 to 7. 

9. A vector according to claim 8, wherein the vector is adapted for recombinant expression of the 
hyperimmune serum reactive antigens or fragment thereof encoded by the nucleic acid molecule 
according to any one of claims 1 to 7. 

10. A host cell comprising the vector according to claim 8 or 9. 

11. A hyperimmune serum-reactive antigen comprising an amino acid sequence being encoded by a 
nucleic acid molecule according to any one of the claims 1, 2, 5, 6 or 7 and fragments thereof, 

wherein the amino acid sequence is selected from the group consisting of Seq ID No 151, 154-158, 
160-168, 170, 172, 174-182, 184-185, 188-190, 193-196, 199-201, 203-204, 207-211, 213, 215-221, 223, 
225-227, 231-232, 238, 241-244 and 246-300. 

12. A hyperimmune serum-reactive antigen comprising an amino acid sequence being encoded by a 
nucleic acid molecule according to any one of the claims 3, 5, 6, or 7 and fragments thereof, 
wherein the amino acid sequence is selected from the group consisting of Seq ID No 214. 

13. A hyperimmune serum-reactive antigen comprising an amino acid sequence being encoded by a 
nucleic acid molecule according to any one of the claims 4, 5, 6, or 7 and fragments thereof, 
wherein the amino acid sequence is selected from the group consisting of Seq ID No 153, 186, 197- 
198, 205, 212, 222, 230, 234, 245. 



14. Fragments of hyperimmune serum-reactive antigens selected from the group consisting of peptides 
comprising amino acid sequences of column "predicted immunogenic aa" and "location of 
identified immunogenic region" of Table 2; the serum reactive epitopes of Table 2, especially 
peptides comprising amino acid 4-44, 57-65, 67-98, 101-107, 109-125, 131-144, 146-159, 168-173, 181- 
186, 191-200, 206-213, 229-245, 261-269, 288-301, 304-317, 323-328, 350-361, 374-384, 388-407, 416-425 
and 1-114 of Seq ID No 151; 5-17, 49-64, 77-82, 87-98, 118-125, 127-140, 142-150, 153-159, 191-207, 
212-218, 226-270, 274-287, 297-306, 325-331, 340-347, 352-369, 377-382, 390-395 and 29-226 of Seq ID 
No 152; 4-16, 20-26, 32-74, 76-87, 93-108, 116-141, 148-162, 165-180, 206-219, 221-228, 230-236, 239- 
245, 257-268, 313-328, 330-335, 353-359, 367-375, 394-403, 414-434, 437-444, 446-453, 456-464, 478-487, 
526-535, 541-552, 568-575, 577-584, 589-598, 610-618, 624-643, 653-665, 667-681, 697-718, 730-748, 
755-761, 773-794, 806-821, 823-831, 837-845, 862-877, 879-889, 896-919, 924-930, 935-940, 947-955, 
959-964, 969-986, 991-1002, 1012-1036, 1047-1056, 1067-1073, 1079-1085, 1088-1111, 1130-1135, 1148- 
1164, 1166-1173, 1185-1192, 1244-1254 and 919-929 of Seq ID No 153; 5-44, 62-74, 78-83, 99-105, 107- 
113, 124-134, 161-174, 176-194, 203-211, 216-237, 241-247, 253-266, 272-299, 323-349, 353-360 and 145- 
305 of Seq ID No 154; 15-39, 52-61, 72-81, 92-97 and 71-81 of Seq ID No 155; 13-19, 21-31, 40-108, 
115-122, 125-140, 158-180, 187-203, 210-223, 235-245 and 173-186 of Seq ID No 156; 5-12, 19-27, 29- 
39, 59-67, 71-78, 80-88, 92-104, 107-124, 129-142, 158-168, 185-191, 218-226, 230-243, 256-267, 272-277, 
283-291, 307-325, 331-344, 346-352 and 316-331 of Seq ID No 157; 6-28, 43-53, 60-76, 93-103 and 21- 
99 of Seq ID No 158; 10-30, 120-126, 145-151, 159-169, 174-182, 191-196, 201-206, 214-220, 222-232, 
254-272, 292-307, 313-323, 332-353, 361-369, 389-396, 401-415, 428-439, 465-481, 510-517, 560-568 and 
9-264 of Seq ID No 159; 5-29, 39-45, 107-128 and 1-112 of Seq ID No 160; 4-38, 42-50, 54-60, 65-71, 
91-102 and 21-56 of Seq ID No 161; 4-13, 19-25, 41-51, 54-62, 68-75, 79-89, 109-122, 130-136, 172-189, 
192-198, 217-224, 262-268, 270-276, 281-298, 315-324, 333-342, 353-370, 376-391 and 23-39 of Seq ID 
No 162; 6-41, 49-58, 62-103, 117-124, 147-166, 173-194, 204-211, 221-229, 255-261, 269-284, 288-310, 
319-325, 348-380, 383-389, 402-410, 424-443, 467-479, 496-517, 535-553, 555-565, 574-581, 583-591 and 
474-489 of Seq ID No 163; 8-35, 52-57, 66-73, 81-88, 108-114, 125-131, 160-167, 174-180, 230-235, 237- 
249, 254-262, 278-285, 308-314, 321-326, 344-353, 358-372, 376-383, 393-411, 439-446, 453-464, 471-480, 
485-492, 502-508, 523-529, 533-556, 558-563, 567-584, 589-597, 605-619, 625-645, 647-666, 671-678, 
690-714, 721-728, 741-763, 766-773, 777-787, 792-802, 809-823, 849-864 and 37-241, 409-534, 582-604, 
743-804 of Seq ID No 164; 4-17, 24-36, 38-44, 59-67, 72-90, 92-121, 126-149, 151-159, 161-175, 197-215, 
217-227, 241-247, 257-264, 266-275, 277-284, 293-307, 315-321, 330-337, 345-350, 357-366, 385-416 and 
202-337 of Seq ID No 165; 4-20, 22-46, 49-70, 80-89, 96-103, 105-119, 123-129, 153-160, 181-223, 227- 
233, 236-243, 248-255, 261-269, 274-279, 283-299, 305-313, 315-332, 339-344, 349-362, 365-373, 380-388, 
391-397, 402-407 and 1-48 of Seq ID No 166; 18-37, 41-63, 100-106, 109-151, 153-167, 170-197, 199- 
207, 212-229, 232-253, 273-297 and 203-217 of Seq ID No 167; 20-26, 54-61, 80-88, 94-101, 113-119, 
128-136, 138-144, 156-188, 193-201, 209-217, 221-229, 239-244, 251-257, 270-278, 281-290, 308-315, 
319-332, 339-352, 370-381, 388-400, 411-417, 426-435, 468-482, 488-497, 499-506, 512-521 and 261-273 
of Seq ID No 168; 6-12, 16-36, 50-56, 86-92, 115-125, 143-152, 163-172, 193-203, 235-244, 280-289, 302- 
315, 325-348, 370-379, 399-405, 411-417, 419-429, 441-449, 463-472, 482-490, 500-516, 536-543, 561-569, 
587-594, 620-636, 647-653, 659-664, 677-685, 687-693, 713-719, 733-740, 746-754, 756-779, 792-799, 
808-817, 822-828, 851-865, 902-908, 920-938, 946-952, 969-976, 988-1005, 1018-1027, 1045-1057, 1063- 
1069, 1071-1078, 1090-1099, 1101-1109, 1113-1127, 1130-1137, 1162-1174, 1211-1221, 1234-1242, 1261- 
1268, 1278-1284, 1312-1317, 1319-1326, 1345-1353, 1366-1378, 1382-1394, 1396-1413, 1415-1424, 1442- 
1457, 1467-1474, 1482-1490, 1492-1530, 1537-1549, 1559-1576, 1611-1616, 1624-1641 and 1-414, 443- 
614, 997-1392 of Seq ID No 169; 14-42, 70-75, 90-100, 158-181 and 1-164 of Seq ID No 170; 4-21, 30- 
36, 54-82, 89-97, 105-118, 138-147 and 126-207 of Seq ID No 171; 4-21, 31-66, 96-104, 106-113, 131- 
142 and 180-204 of Seq ID No 172; 5-23, 31-36, 38-55, 65-74, 79-88, 101-129, 131-154, 156-165, 183- 
194, 225-237, 245-261, 264-271, 279-284, 287-297, 313-319, 327-336, 343-363, 380-386 and 11-197, 204- 
219, 258-372 of Seq ID No 173; 4-20, 34-41, 71-86, 100-110, 113-124, 133-143, 150-158, 160-166, 175- 
182, 191-197, 213-223, 233-239, 259-278, 298-322 and 195-289 of Seq ID No 174; 4-10, 21-35, 44-52, 
54-62, 67-73, 87-103, 106-135, 161-174, 177-192, 200-209, 216-223, 249-298, 304-312, 315-329 and 12- 
130 of Seq ID No 175; 10-27, 33-38, 48-55, 70-76, 96-107, 119-133, 141-147, 151-165, 183-190, 197-210, 
228-236, 245-250, 266-272, 289-295, 297-306, 308-315, 323-352, 357-371, 381-390, 394-401, 404-415, 
417-425, 427-462, 466-483, 485-496, 502-507, 520-529, 531-541, 553-570, 577-588, 591-596, 600-610, 
619-632, 642-665, 671-692, 694-707 and 434-444 of Seq ID No 176; 6-14, 16-25, 36-46, 52-70, 83-111, 
129-138, 140-149, 153-166, 169-181, 188-206, 212-220, 223-259, 261-269, 274-282, 286-293, 297-306, 
313-319, 329-341, 343-359, 377-390, 409-415, 425-430 and 360-375 of Seq ID No 177; 4-26, 28-48, 54- 
62, 88-121, 147-162, 164-201, 203-237, 245-251 and 254-260 of Seq ID No 178; 12-21, 26-32, 66-72, 87- 
93, 98-112, 125-149, 179-203, 209-226, 233-242, 249-261, 266-271, 273-289, 293-318, 346-354, 360-371, 
391-400 and 369-382 of Seq ID No 179; 11-38, 44-65, 70-87, 129-135, 140-163, 171-177, 225-232, 238- 
249, 258-266, 271-280, 284-291, 295-300, 329-337, 344-352, 405-412, 416-424, 426-434, 436-455, 462-475, 
478-487 and 270-312 of Seq ID No 180; 5-17, 34-45, 59-69, 82-88, 117-129, 137-142, 158-165, 180-195, 
201-206, 219-226, 241-260, 269-279, 292-305, 312-321, 341-347, 362-381, 396-410, 413-432, 434-445, 
447-453, 482-487, 492-499, 507-516, 546-552, 556-565, 587-604 and 486-598 of Seq ID No 181; 4-15, 
17-32, 40-47, 67-78, 90-98, 101-107, 111-136, 161-171, 184-198, 208-214, 234-245, 247-254, 272-279, 288- 
298, 303-310, 315-320, 327-333, 338-349, 364-374 and 378-396 of Seq ID No 182; 5-27, 33-49, 51-57, 
74-81, 95-107, 130-137, 148-157, 173-184 and 75-235 of Seq ID No 183; 6-23, 47-53, 57-63, 75-82, 97- 
105, 113-122, 124-134, 142-153, 159-164, 169-179, 181-187, 192-208, 215-243, 247-257, 285-290, 303-310 
and 30-51 of Seq ID No 184; 17-29, 44-52, 59-73, 77-83, 86-92, 97-110, 118-153, 156-166, 173-179, 192- 
209, 225-231, 234-240, 245-251, 260-268, 274-279, 297-306, 328-340, 353-360, 369-382, 384-397, 414-423, 
431-436, 452-465, 492-498, 500-508, 516-552, 554-560, 568-574, 580-586, 609-617, 620-626, 641-647 and 
208-219 of Seq ID No 185; 4-26, 32-45, 58-72, 111-119, 137-143, 146-159, 187-193, 221-231, 235-242, 
250-273, 290-304, 311-321, 326-339, 341-347, 354-368, 397-403, 412-419, 426-432, 487-506, 580-592, 
619-628, 663-685, 707-716, 743-751, 770-776, 787-792, 850-859, 866-873, 882-888, 922-931, 957-963, 
975-981, 983-989, 1000-1008, 1023-1029, 1058-1064, 1089-1099, 1107-1114, 1139-1145, 1147-1156, 1217- 
1226, 1276-1281, 1329-1335, 1355-1366, 1382-1394, 1410-1416, 1418-1424, 1443-1451, 1461-1469, 1483- 
1489, 1491-1501, 1515-1522, 1538-1544, 1549-1561, 1587-1593, 1603-1613, 1625-1630, 1636-1641, 1684- 
1690, 1706-1723, 1765-1771, 1787-1804, 1850-1857, 1863-1894, 1897-1910, 1926-1935, 1937-1943, 1960- 
1983, 1991-2005, 2008-2014, 2018-2039 and 396-533, 1342-1502, 1672-1920 of Seq ID No 186; 4-25, 45- 
50, 53-65, 79-85, 87-92, 99-109, 126-137, 141-148, 156-183, 190-203, 212-217, 221-228, 235-242, 247-277, 
287-293, 300-319, 321-330, 341-361, 378-389, 394-406, 437-449, 455-461, 472-478, 482-491, 507-522, 
544-554, 576-582, 587-593, 611-621, 626-632, 649-661, 679-685, 696-704, 706-716, 726-736, 740-751, 
759-766, 786-792, 797-802, 810-822, 824-832, 843-852, 863-869, 874-879, 882-905 and 1-113, 210-232, 
250-423, 536-564 of Seq ID No 187; 4-16, 33-39, 43-49, 54-85, 107-123, 131-147, 157-169, 177-187, 198- 
209, 220-230, 238-248, 277-286, 293-301, 303-315, 319-379, 383-393, 402-414, 426-432, 439-449, 470-478, 
483-497, 502-535, 552-566, 571-582, 596-601, 608-620, 631-643, 651-656, 663-678, 680-699, 705-717, 
724-732, 738-748, l^b-ie^, 766-772, 776-791, 796-810, 819-827, 829-841, 847-861, 866-871, 876-882, 
887-894, 909-934, 941-947, 957-969, 986-994, 998-1028, 1033-1070, 1073-1080, 1090-1096, 1098-1132, 
1134-1159, 1164-1172, 1174-1201 and 617-635 of Seq ID No 188; 7-25, 30-40, 42-64, 70-77, 85-118, 

120-166, 169-199, 202-213, 222-244 and 190-203 of Seq ID No 189; 4-11, 15-53, 55-93, 95-113, 120-159, 
164-200, 210-243, 250-258, 261-283, 298-319, 327-340, 356-366, 369-376, 380-386, 394-406, 409-421, 
425-435, 442-454, 461-472, 480-490, 494-505, 507-514, 521-527, 533-544, 566-574 and 385-398 of Seq 
ID No 190; 5-36, 66-72, 120-127, 146-152, 159-168, 172-184, 205-210, 221-232, 234-243, 251-275, 295- 
305, 325-332, 367-373, 470-479, 482-487, 520-548, 592-600, 605-615, 627-642, 655-662, 664-698, 718-725, 
734-763, 776-784, 798-809, 811-842, 845-852, 867-872, 879-888, 900-928, 933-940, 972-977, 982-1003 
and 12-190, 276-283, 666-806 of Seq ID No 191; 4-38, 63-68, 100-114, 160-173, 183-192, 195-210, 212- 
219, 221-238, 240-256, 258-266, 274-290, 301-311, 313-319, 332-341, 357-363, 395-401, 405-410, 420-426, 
435-450, 453-461, 468-475, 491-498, 510-518, 529-537, 545-552, 585-592, 602-611, 634-639, 650-664 and 
30-80, 89-105, 111-151 of Seq ID No 192; 7-29, 31-39, 47-54, 63-74, 81-94, 97-117, 122-127, 146-157, 
168-192, 195-204, 216-240, 251-259 and 195-203 of Seq ID No 193; 5-16, 28-34, 46-65, 79-94, 98-105, 
107-113, 120-134, 147-158, 163-172, 180-186, 226-233, 237-251, 253-259, 275-285, 287-294, 302-308, 
315-321, 334-344, 360-371, 399-412, 420-426 and 32-50 of Seq ID No 194; 8-20, 30-36, 71-79, 90-96, 
106-117, 125-138, 141-147, 166-174 and 75-90 of Seq ID No 195; 4-13, 15-33, 43-52, 63-85, 98-114, 131- 
139, 146-174, 186-192, 198-206, 227-233 and 69-88 of Seq ID No 196; 4-22, 29-35, 59-68, 153-170, 213- 
219, 224-238, 240-246, 263-270, 285-292, 301-321, 327-346, 356-371, 389-405, 411-418, 421-427, 430-437, 
450-467, 472-477, 482-487, 513-518, 531-538, 569-576, 606-614, 637-657, 662-667, 673-690, 743-753, 
760-767, 770-777, 786-802 and 96-230, 361-491, 572-585 of Seq ID No 197; 4-12, 21-36, 48-55, 74-82, 

121-127, 195-203, 207-228, 247-262, 269-278, 280-289 and 102-210 of Seq ID No 198; 13-20, 23-31, 38- 
44, 78-107, 110-118, 122-144, 151-164, 176-182, 190-198, 209-216, 219-243, 251-256, 289-304, 306-313 
and 240-248 of Seq ID No 199; 5-26, 34-48, 57-77, 84-102, 116-132, 139-145, 150-162, 165-173, 176- 
187, 192-205, 216-221, 234-248, 250-260 and 182-198 of Seq ID No 200; 10-19, 26-44, 53-62, 69-87, 90- 
96, 121-127, 141-146, 148-158, 175-193, 204-259, 307-313, 334-348, 360-365, 370-401, 411-439, 441-450, 
455-462, 467-472, 488-504 and 41-56 of Seq ID No 201; 5-21, 36-42, 96-116, 123-130, 138-144, 146-157, 
184-201, 213-228, 252-259, 277-297, 308-313, 318-323, 327-333 and 202-217 of Seq ID No 202; 6-26, 
33-51, 72-90, 97-131, 147-154, 164-171, 187-216, 231-236, 260-269, 275-283 and 1-127 of Seq ID No 
203; 4-22, 24-38, 44-58, 72-88, 99-108, 110-117, 123-129, 131-137, 142-147, 167-178, 181-190, 206-214, 
217-223, 271-282, 290-305, 320-327, 329-336, 343-352, 354-364, 396-402, 425-434, 451-456, 471-477, 
485-491, 515-541, 544-583, 595-609, 611-626, 644-656, 660-681, 683-691, 695-718 and 297-458 of Seq 
ID No 204; 5-43, 92-102, 107-116, 120-130, 137-144, 155-163, 169-174, 193-213 and 24-135 of Seq ID 
No 205; 4-25, 61-69, 73-85, 88-95, 97-109, 111-130, 135-147, 150-157, 159-179, 182-201, 206-212, 224- 
248, 253-260, 287-295, 314-331, 338-344, 365-376, 396-405, 413-422, 424-430, 432-449, 478-485, 487-494, 
503-517, 522-536, 544-560, 564-578, 585-590, 597-613, 615-623, 629-636, 640-649, 662-671, 713-721 and 
176-330 of Seq ID No 206; 31-37, 41-52, 58-79, 82-105, 133-179, 184-193, 199-205, 209-226, 256-277, 
281-295, 297-314, 322-328, 331-337, 359-367, 379-395, 403-409, 417-432, 442-447, 451-460, 466-472 and 
46-62, 296-341 of Seq ID No 207; 23-29, 56-63, 67-74, 96-108, 122-132, 139-146, 152-159, 167-178, 189- 
196, 214-231, 247-265, 274-293, 301-309, 326-332, 356-363, 378-395, 406-412, 436-442, 445-451, 465-479, 
487-501, 528-555, 567-581, 583-599, 610-617, 622-629, 638-662, 681-686, 694-700, 711-716 and 667-684 
of Seq ID No 208; 20-51, 53-59, 109-115, 140-154, 185-191, 201-209, 212-218, 234-243, 253-263, 277- 
290, 303-313, 327-337, 342-349, 374-382, 394-410, 436-442, 464-477, 486-499, 521-530, 536-550, 560-566, 
569-583, 652-672, 680-686, 698-704, 718-746, 75^-770, 774-788, 802-827, 835-842, 861-869 and 258-416 
of Seq ID No 209; 7-25, 39-45, 59-70, 92-108, 116-127, 161-168, 202-211, 217-227, 229-239, 254-262, 
271-278, 291-300 and 278-295 of Seq ID No 210; 4-20, 27-33, 45-51, 53-62, 66-74, 81-88, 98-111, 124- 
130, 136-144, 156-179, 183-191 and 183-195 of Seq ID No 211; 12-24, 27-33, 43-49, 55-71, 77-85, 122- 
131, 168-177, 179-203, 209-214, 226-241 and 63-238 of Seq ID No 212; 4-19, 37-50, 120-126, 131-137, 
139-162, 177-195, 200-209, 211-218, 233-256, 260-268, 271-283, 288-308 and 1-141 of Seq ID No 213; 
11-17, 40-47, 57-63, 96-124, 141-162, 170-207, 223-235, 241-265, 271-277, 281-300, 312-318, 327-333, 
373-379 and 231-368 of Seq ID No 214; 9-33, 41-48, 57-79, 97-103, 113-138, 146-157, 165-186, 195-201, 
209-215, 223-229, 237-247, 277-286, 290-297, 328-342 and 247-260 of Seq ID No 215; 7-15, 39-45, 58- 
64, 79-84, 97-127, 130-141, 163-176, 195-203, 216-225, 235-247, 254-264, 271-279 and 64-72 of Seq ID 

No 216; 4-12, 26-42, 46-65, 73-80, 82-94, 116-125, 135-146, 167-173, 183-190, 232-271, 274-282, 300-306, 
320-343, 351-362, 373-383, 385-391, 402-409, 414-426, 434-455, 460-466, 473-481, 485-503, 519-525, 
533-542, 55^-565, 599-624, 645-651, 675-693, 717-725, 751-758, 767-785, 792-797, 801-809, 819-825, 
831-836, 859-869, 890-897 and 222-362, 756-896 of Seq ID No 217; 11-17, 22-28, 52-69, 73-83, 86-97, 
123-148, 150-164, 166-177, 179-186, 188-199, 219-225, 229-243, 250-255 and 153-170 of Seq ID No 218; 
4-61, 71-80, 83-90, 92-128, 133-153, 167-182, 184-192, 198-212 and 56-73 of Seq ID No 219; 4-19, 26- 
37, 45-52, 58-66, 71-77, 84-92, 94-101, 107-118, 120-133, 156-168, 170-179, 208-216, 228-238, 253-273, 
280-296, 303-317, 326-334 and 298-312 of Seq ID No 220; 7-13, 27-35, 38-56, 85-108, 113-121, 123-160, 
163-169, 172-183, 188-200, 206-211, 219-238, 247-254 and 141-157 of Seq ID No 221; 23-39, 45-73, 86- 
103, 107-115, 125-132, 137-146, 148-158, 160-168, 172-179, 185-192, 200-207, 210-224, 233-239, 246-255, 
285-334, 338-352, 355-379, 383-389, 408-417, 423-429, 446-456, 460-473, 478-503, 522-540, 553-562, 
568-577, 596-602, 620-636, 640-649, 655-663 and 433-440, 572-593 of Seq ID No 222; 4-42, 46-58, 64- 
76, 118-124, 130-137, 148-156, 164-169, 175-182, 187-194, 203-218, 220-227, 241-246, 254-259, 264-270, 
275-289, 296-305, 309-314, 322-334, 342-354, 398-405, 419-426, 432-443, 462-475, 522-530, 552-567, 
593-607, 618-634, 636-647, 653-658, 662-670, 681-695, 698-707, 709-720, 732-742, 767-792, 794-822, 
828-842, 851-866, 881-890, 895-903, 928-934, 940-963, 978-986, 1003-1025, 1027-1043, 1058-1075, 1080- 
1087, 1095-1109, 1116-1122, 1133-1138, 1168-1174, 1179-1186, 1207-1214, 1248-1267 and 17-319, 417- 
563 of Seq ID No 223; 6-19, 23-33, 129-138, 140-150, 153-184, 190-198, 206-219, 235-245, 267-275, 284- 
289, 303-310, 322-328, 354-404, 407-413, 423-446, 453-462, 467-481, 491-500 and 46-187 of Seq ID No 
224; 4-34, 39-57, 78-86, 106-116, 141-151, 156-162, 165-172, 213-237, 252-260, 262-268, 272-279, 296- 
307, 332-338, 397-403, 406-416, 431-446, 448-453, 464-470, 503-515, 519-525, 534-540, 551-563, 578-593, 
646-668, 693-699, 703-719, 738-744, 748-759, 771-777, 807-813, 840-847, 870-876, 897-903, 910-925, 
967-976, 979-992 and 21-244, 381-499, 818-959 of Seq ID No 225; 19-29, 65-75, 90-109, 111-137, 155- 
165, 169-175 and 118-136 of Seq ID No 226; 15-20, 30-36, 55-63, 73-79, 90-117, 120-127, 136-149, 166- 
188, 195-203, 211-223, 242-255, 264-269, 281-287, 325-330, 334-341, 348-366, 395-408, 423-429, 436-444, 
452-465 and 147-155 of Seq ID No 227; 11-18, 21-53, 77-83, 91-98, 109-119, 142-163, 173-181, 193-208, 
216-227, 238-255, 261-268, 274-286, 290-297, 308-315, 326-332, 352-359, 377-395, 399-406, 418-426, 
428-438, 442-448, 458-465, 473-482, 488-499, 514-524, 543-553, 564-600, 623-632, 647-654, 660-669, 
672-678, 710-723, 739-749, 787-793, 820-828, 838-860, 889-895, 901-907, 924-939, 956-962, 969-976, 
991-999, 1012-1018, 1024-1029, 1035-1072, 1078-1091, 1142-1161 and 74-438 of Seq ID No 228; 4-31, 
41-52, 58-63, 65-73, 83-88, 102-117, 123-130, 150-172, 177-195, 207-217, 222-235, 247-253, 295-305, 315- 
328, 335-342, 359-365, 389-394, 404-413 and 156-420 of Seq ID No 229; 4-42, 56-69, 98-108, 120-125, 
210-216, 225-231, 276-285, 304-310, 313-318, 322-343 and 79-348 of Seq ID No 230; 12-21, 24-30, 42- 
50, 61-67, 69-85, 90-97, 110-143, 155-168 and 53-70 of Seq ID No 231; 4-26, 41-54, 71-78, 88-96, 116- 
127, 140-149, 151-158, 161-175, 190-196, 201-208, 220-226, 240-247, 266-281, 298-305, 308-318, 321-329, 
344-353, 370-378, 384-405, 418-426, 429-442, 457-463, 494-505, 514-522 and 183-341 of Seq ID No 232; 
4-27, 69-77, 79-101, 117-123, 126-142, 155-161, 171-186, 200-206, 213-231, 233-244, 258-263, 269-275, 
315-331, 337-346, 349-372, 376-381, 401-410, 424-445, 447-455, 463-470, 478-484, 520-536, 546-555, 
558-569, 580-597, 603-618, 628-638, 648-660, 668-683, 717-723, 765-771, 781-788, 792-806, 812-822 and 
92-231, 618-757 of Seq ID No 233; 11-47, 63-75, 108-117, 119-128, 133-143, 171-185, 190-196, 226-232, 
257-264, 278-283, 297-309, 332-338, 341-346, 351-358, 362-372 and 41-170 of Seq ID No 234; 6-26, 50- 
56, 83-89, 108-114, 123-131, 172-181, 194-200, 221-238, 241-259, 263-271, 284-292, 304-319, 321-335, 
353-358, 384-391, 408-417, 424-430, 442-448, 459-466, 487-500, 514-528, 541-556, 572-578, 595-601, 
605-613, 620-631, 634-648, 660-679, 686-693, 702-708, 716-725, 730-735, 749-755, 770-777, 805-811, 
831-837, 843-851, 854-860, 863-869, 895-901, 904-914, 922-929, 933-938, 947-952, 956-963, 1000-1005, 
1008-1014, 1021-1030, 1131-1137, 1154-1164, 1166-1174 and 20-487, 757-1153 of Seq ID No 235; 10- 
34, 67-78, 131-146, 160-175, 189-194, 201-214, 239-250, 265-271, 296-305 and 26-74, 91-100, 105-303 of 
Seq ID No 236; 9-15, 19-32, 109-122, 143-150, 171-180, 186-191, 209-217, 223-229, 260-273, 302-315, 
340-346, 353-359, 377-383, 389-406, 420-426, 460-480 and 10-223, 231-251, 264-297, 312-336 of Seq ID 
No 237; 5-28, 76-81, 180-195, 203-209, 211-219, 227-234, 242-252, 271-282, 317-325, 350-356, 358-364, 
394-400, 405-413, 417-424, 430-436, 443-449, 462-482, 488-498, 503-509, 525-537 and 22-344 of Seq ID 
No 238; 5-28, 42-54, 77-83, 86-93, 98-104, 120-127, 145-159, 166-176, 181-187, 189-197, 213-218, 230- 
237, 263-271, 285-291, 299-305, 326-346, 368-375, 390-395 and 1-151 of Seq ID No 239; 6-34, 48-55, 
58-64, 84-101, 121-127, 143-149, 153-159, 163-170, 173-181, 216-225, 227-240, 248-254, 275-290, 349- 
364, 375-410, 412-418, 432-438, 445-451, 465-475, 488-496, 505-515, 558-564, 571-579, 585-595, 604-613, 
626-643, 652-659, 677-686, 688-696, 702-709, 731-747, 777-795, 820-828, 836-842, 845-856, 863-868, 
874-882, 900-909, 926-943, 961-976, 980-986, 992-998, 1022-1034, 1044-1074, 1085-1096, 1101-1112, 
1117-1123, 1130-1147, 1181-1187, 1204-1211, 1213-1223, 1226-1239, 1242-1249, 1265-1271, 1273-1293, 
1300-1308, 1361-1367, 1378-1384, 1395-1406, 1420-1428, 1439-1446, 1454-1460, 1477-1487, 1509-1520, 
1526-1536, 1557-1574, 1585-1596, 1605-1617, 1621-1627, 1631-1637, 1648-1654, 1675-1689, 1692-1698, 
1700-1706, 1712-1719, 1743-1756 and 91-263 of Seq ID No 240; 4-16, 75-90, 101-136, 138-144, 158-164, 
171-177, 191-201, 214-222, 231-241, 284-290, 297-305, 311-321, 330-339, 352-369, 378-385, 403-412, 
414-422, 428-435, 457-473, 503-521, 546-554, 562-568, 571-582, 589-594, 600-608, 626-635, 652-669, 
687-702, 706-712, 718-724, 748-760, 770-775 and 261-272 of Seq ID No 241; 4-19, 30-41, 46-57, 62-68, 
75-92, 126-132, 149-156, 158-168, 171-184, 187-194, 210-216, 218-238, 245-253, 306-312, 323-329, 340- 
351, 365-373, 384-391, 399-405, 422-432, 454-465, 471-481, 502-519, 530-541, 550-562, 566-572, 576-582, 
593-599, 620-634, 637-643, 645-651, 657-664, 688-701 and 541-551 of Seq ID No 242; 6-11, 17-25, 53- 
58, 80-86, 91-99, 101-113, 123-131, 162-169, 181-188, 199-231, 245-252 and 84-254 of Seq ID No 243; 
13-30, 71-120, 125-137, 139-145, 184-199 and 61-78 of Seq ID No 244; 9-30, 38-53, 63-70, 74-97, 103- 
150, 158-175, 183-217, 225-253, 260-268, 272-286, 290-341, 352-428, 434-450, 453-460, 469-478, 513-525, 
527-534, 554-563, 586-600, 602-610, 624-640, 656-684, 707-729, 735-749, 757-763, 766-772, 779-788, 
799-805, 807-815, 819-826, 831-855 and 568-580 of Seq ID No 245; 11-21, 29-38 and 5-17 of Seq ID 
No 246; 2-9 of Seq ID No 247; 4-10, 16-28 and 7-18, 26-34 of Seq ID No 248; 10-16 and 1-15 of Seq 
ID No 249; 4-11 of Seq ID No 250; 4-40, 42-51 and 37-53 of Seq ID No 251; 4-21 and 22-29 of Seq 
ID No 252; 2-11 of Seq ID No 253; 9-17, 32-44 and 1-22 of Seq ID No 254; 19-25, 27-32 and 15-34 of 
Seq ID No 255; 4-12, 15-22 and 11-33 of Seq ID No 256; 10-17, 24-30, 39-46, 51-70 and 51-61 of Seq 
ID No 257; 6-19 of Seq ID No 258; 6-11, 21-27, 31-54 and 11-29 of Seq ID No 259; 4-10, 13-45 and 
11-35 of Seq ID No 260; 4-14, 23-32 and 11-35 of Seq ID No 261; 14-39, 45-51 and 15-29 of Seq ID 
No 262; 4-11, 14-28 and 4-17 of Seq ID No 263; 4-16 and 2-16 of Seq ID No 264; 4-10, 12-19, 39-50 
and 6-22 of Seq ID No 265; 2-13 of Seq ID No 266; 4-11, 22-65 and 3-19 of Seq ID No 267; 17-23, 30- 
35, 39-46, 57-62 and 30-49 of Seq ID No 268; 4-19 and 14-22 of Seq ID No 269; 2-9 of Seq ID No 
270; 7-18, 30-43 and 4-12 of Seq ID No 271; 4-30, 39-47 and 5-22 of Seq ID No 272; 6-15 and 14-29 of 
Seq ID No 273; 4-34 and 23-35 of Seq ID No 274; 4-36, 44-57, 65-72 and 14-27 of Seq ID No 275; 4- 
18 and 11-20 of Seq ID No 276; 5-19 of Seq ID No 277; 18-36 and 6-20 of Seq ID No 278; 4-10, 19- 
34, 41-84, 96-104 and 50-63 of Seq ID No 279; 4-9, 19-27 and 8-21 of Seq ID No 280; 4-16, 18-28 and 
22-30 of Seq ID No 281; 4-15 and 21-35 of Seq ID No 282; 4-17 and 3-13 of Seq ID No 283; 4-12 and 
4-18 of Seq ID No 284; 4-24, 31-36 and 29-45 of Seq ID No 285; 12-22, 34-49 and 21-32 of Seq ID No 
286; 4-17 and 22-32 of Seq ID No 287; 4-16, 25-42 and 7-28 of Seq ID No 288; 4-10 and 7-20 of Seq 
ID No 289; 4-11, 16-36, 39-54 and 28-44 of Seq ID No 290; 5-20, 29-54 and 14-29 of Seq ID No 291; 
24-33 and 10-22 of Seq ID No 292; 10-51, 54-61 and 43-64 of Seq ID No 293; 7-13 and 2-17 of Seq ID 
No 294; 11-20 and 6-20 of Seq ID No 295; 4-30, 34-41 and 19-28 of Seq ID No 296; 11-21 of Seq ID 
No 297; 4-16, 21-26 and 9-38 of Seq ID No 298; 4-12, 15-27, 30-42, 66-72 and 10-24 of Seq ID No 299; 
8-17 and 11-20 of Seq ID No 300; and 2-19 of Seq ID No 246; 1-12 of Seq ID No 247; 21-38 of Seq 
ID No 248; 2-22 of Seq ID No 254; 15-33 of Seq ID No 255; 11-32 of Seq ID No 256; 11-28 of Seq ID 
No 259; 10-27 of Seq ID No 260; 9-26 of Seq ID No 261; 4-16 of Seq ID No 263; 1-18 of Seq ID No 
266; 12-29 of Seq ID No 273; 6-23 of Seq ID No 276; 1-21 of Seq ID No 277; 47-64 of Seq ID No 279; 
28-45 of Seq ID No 285; 18-35 of Seq ID No 287; 14-31 of Seq ID No 291; 7-24 of Seq ID No 292; 8- 
25 of Seq ID No 299; 1-20 of Seq ID No 300; 18-33 of Seq ID No 151; 62-72 of Seq ID No 151; 118- 
131 of Seq ID No 152; 195-220 of Seq ID No 154; 215-240 of Seq ID No 154; 255-280 of Seq ID No 
154; 72-81 of Seq ID No 155; 174-186 of Seq ID No 156; 317-331 of Seq ID No 157; 35-59 of Seq ID 
No 158; 54-84 of Seq ID No 158; 79-104 of Seq ID No 158; 33-58 of Seq ID No 159; 81-101 of Seq ID 
No 159; 136-150 of Seq ID No 159; 173-186 of Seq ID No 159; 231-251 of Seq ID No 159; 22-48 of 
Seq ID No 161; 24-39 of Seq ID No 162; 475-489 of Seq ID No 163; 38-56 of Seq ID No 164; 583-604 
of Seq ID No 164; 202-223 of Seq ID No 165; 222-247 of Seq ID No 165; 242-267 of Seq ID No 165; 
262-287 of Seq ID No 165; 282-307 of Seq ID No 165; 302-327 of Seq ID No 165; 25-48 of Seq ID No 
166; 204-217 of Seq ID No 167; 259-276 of Seq ID No 168; 121-139 of Seq ID No 169; 260-267 of Seq 
ID No 169; 215-240 of Seq ID No 169; 115-140 of Seq ID No 170; 182-204 of Seq ID No 172; 144-153 
of Seq ID No 173; 205-219 of Seq ID No 173; 196-206 of Seq ID No 174; 240-249 of Seq ID No 174; 
272-287 of Seq ID No 174; 199-223 of Seq ID No 174; 218-237 of Seq ID No 174; 226-249 of Seq ID 
No 175; 287-306 of Seq ID No 175; 430-449 of Seq ID No 176; 361-375 of Seq ID No 177; 241-260 of 
Seq ID No 178; 483-502 of Seq ID No 181; 379-396 of Seq ID No 182; 31-51 of Seq ID No 184; 1436- 
1460 of Seq ID No 186; 1455-1474 of Seq ID No 186; 1469-1487 of Seq ID No 186; 215-229 of Seq 
ID No 187; 534-561 of Seq ID No 187; 59-84 of Seq ID No 187; 79-104 of Seq ID No 187; 618-635 of 
Seq ID No 188; 191-203 of Seq ID No 189; 386-398 of Seq ID No 190; 65-83 of Seq ID No 191; 90- 
105 of Seq ID No 192; 112-136 of Seq ID No 192; 290-209 of Seq ID No 193; 33-50 of Seq ID No 
194; 76-90 of Seq ID No 195; 70-88 of Seq ID No 196; 418-442 of Seq ID No 197; 574-585 of Seq ID 
No 197; 87-104 of Seq ID No 198; 124-148 of Seq ID No 198; 141-152 of Seq ID No 198; 241-248 of 
Seq ID No 199; 183-198 of Seq ID No 200; 40-57 of Seq ID No 201; 202-217 of Seq ID No 202; 50-74 
of Seq ID No 203; 69-93 of Seq ID No 203; 88-112 of Seq ID No 203; 107-127 of Seq ID No 203; 74- 
92 of Seq ID No 205; 207-232 of Seq ID No 206; 227-252 of Seq ID No 206; 247-272 of Seq ID No 
206; 47-60 of Seq ID No 207; 297-305 of Seq ID No 207; 312-337 of Seq ID No 207; 667-684 of Seq 
ID No 208; 279-295 of Seq ID No 210; 179-198 of Seq ID No 211; 27-51 of Seq ID No 213; 46-70 of 
Seq ID No 213; 65-89 of Seq ID No 213; 84-108 of Seq ID No 213; 112-141 of Seq ID No 213; 248- 
260 of Seq ID No 215; 59-78 of Seq ID No 216; 154-170 of Seq ID No 218; 57-73 of Seq ID No 219; 
297-314 of Seq ID No 220; 142-157 of Seq ID No 221; 428-447 of Seq ID No 222; 573-593 of Seq ID 
No 222; 523-544 of Seq ID No 223; 46-70 of Seq ID No 223; 65-89 of Seq ID No 223; 84-108 of Seq 
ID No 223; 122-151 of Seq ID No 223; 123-142 of Seq ID No 224; 903-921 of Seq ID No 225; 119-136 
of Seq ID No 226; 142-161 of Seq ID No 227; 258-277 of Seq ID No 228; 272-300 of Seq ID No 228; 
295-322 of Seq ID No 228; 311-343 of Seq ID No 229; 278-304 of Seq ID No 229; 131-150 of Seq ID 
No 230; 195-218 of Seq ID No 230; 53-70 of Seq ID No 231; 184-208 of Seq ID No 232; 222-246 of 
Seq ID No 232; 241-265 of Seq ID No 232; 260-284 of Seq ID No 232; 279-303 of Seq ID No 232; 
317-341 of Seq ID No 232; 678-696 of Seq ID No 233; 88-114 of Seq ID No 235; 464-481 of Seq ID 
No 235; 153-172 of Seq ID No 236; 137-155, 166-184 of Seq ID No 236; 215-228 of Seq ID No 236; 
37-51 of Seq ID No 237; 53-75 of Seq ID No 237; 232-251 of Seq ID No 237; 318-336 of Seq ID No 
237; 305-315 of Seq ID No 238; 131-156 of Seq ID No 238; 258-275 of Seq ID No 241; 107-137 of Seq 
ID No 243; 138-162 of Seq ID No 243; 157-181 of Seq ID No 243; 195-227 of Seq ID No 243; 62-78 
of Seq ID No 244; 567-584 of Seq ID No 245. 

15. A process for producing a S. pyogenes hyperimmune serum reactive antigen or a fragment thereof 
according to any one of the claims 11 to 14 comprising expressing the nucleic acid molecule 
according to any one of claims 1 to 7. 

16. A process for producing a cell, which expresses a S. pyogenes hyperimmune serum reactive 
antigen or a fragment thereof according to any one of the claims 11 to 14 comprising transforming 
or transfecting a suitable host cell with the vector according to claim 8 or claim 9. 

17. A pharmaceutical composition, especially a vaccine, comprising a hyperimmune serum-reactive 
antigen or a fragment thereof, as defined in any one of claims 11 to 14 or a nucleic acid molecule 
according to any one of claims 1 to 7. 

18. A pharmaceutical composition, especially a vaccine, according to claim 17, characterized in that it 
further comprises an immunostimulatory substance, preferably selected from the group 
comprising polycationic polymers, especially polycationic peptides, immunostimulatory 
deoxynucleotides (ODNs), peptides containing at least two LysLeuLys motifs, neuroactive 
compounds, especially human growth hormone, alum, Freund's complete or incomplete 
adjuvants or combinations thereof. 

19. Use of a nucleic acid molecule according to any one of claims 1 to 7 or a hyperimmune serum- 
reactive antigen or fragment thereof according to any one of claims 11 to 14 for the manufacture of 
a pharmaceutical preparation, especially for the manufacture of a vaccine against S. pyogenes 
infection. 

20. An antibody, or at least an effective part thereof, which binds at least to a selective part of the 
hyperimmune serum-reactive antigen or a fragment thereof according to any one of claims 11 to 
14. 

21. An antibody according to claim 20, wherein the antibody is a monoclonal antibody. 

22. An antibody according to claim 20 or 21, wherein said effective part comprises Fab fragments. 

23. An antibody according to any one of claims 20 to 22, wherein the antibody is a chimeric antibody. 

24. An antibody according to any one of claims 20 to 23, wherein the antibody is a humanized 
antibody. 

25. A hybridoma cell line, which produces an antibody according to any one of claims 20 to 24. 

26. A method for producing an antibody according to claim 20, characterized by the following steps: 

• initiating an immune response in a non-human animal by administering a hyperimmune 
serum-reactive antigen or a fragment thereof, as defined in any one of the claims 11 to 14, to 
said animal, 

• removing an antibody containing body fluid from said animal, and 

• producing the antibody by subjecting said antibody containing body fluid to further 
purification steps. 

27. Method for producing an antibody according to claim 21, characterized by the following steps: 

• initiating an immune response in a non-human animal by administering a hyperimmune 
serum-reactive antigen or a fragment thereof, as defined in any one of the claims 12 to 15, to 
said animal, 

• removing the spleen or spleen cells from said animal, 

• producing hybridoma cells of said spleen or spleen cells, 

• selecting and cloning hybridoma cells specific for said hyperimmune serum-reactive antigens or 
a fragment thereof, 

• producing the antibody by cultivation of said cloned hybridoma cells and optionally further 
purification steps. 

28. Use of the antibodies according to any one of claims 20 to 24 for the preparation of a medicament 
for treating or preventing S. pyogenes infections. 

29. An antagonist which binds to the hyperimmune serum-reactive antigen or a fragment thereof 
according to any one of claims 11 to 14. 

30. A method for identifying an antagonist capable of binding to the hyperimmune serum-reactive 
antigen or fragment thereof according to any one of claims 11 to 14 comprising: 

a) contacting an isolated or immobilized hyperimmune serum-reactive antigen or a fragment 
thereof according to any one of claims 11 to 14 with a candidate antagonist under conditions to 
permit binding of said candidate antagonist to said hyperimmune serum-reactive antigen or 
fragment, in the presence of a component capable of providing a detectable signal in response to 
the binding of the candidate antagonist to said hyperimmune serum reactive antigen or fragment 
thereof; and 

b) detecting the presence or absence of a signal generated in response to the binding of the 
antagonist to the hyperimmune serum reactive antigen or the fragment thereof. 

31. A method for identifying an antagonist capable of reducing or inhibiting the interaction activity of 
a hyperimmune serum-reactive antigen or a fragment thereof according to any one of claims 11 to 
14 to its interaction partner comprising: 

a) providing a hyperimmune serum reactive antigen or a hyperimmune 
fragment thereof according to any one of claims 11-14, 

b) providing an interaction partner to said hyperimmune serum reactive antigen or a fragment 
thereof, especially an antibody according to any one of the claims 20 to 24, 

c) allowing interaction of said hyperimmune serum reactive antigen or fragment thereof to said 
interaction partner to form an interaction complex, 

d) providing a candidate antagonist, 

e) allowing a competition reaction to occur between the candidate antagonist and the interaction 
complex, 

f) determining whether the candidate antagonist inhibits or reduces the interaction activities of the 
hyperimmune serum reactive antigen or the fragment thereof with the interaction partner. 

32. Use of any of the hyperimmune serum reactive antigen or fragment thereof according to any one of 
claims 11 to 14 for the isolation and/or purification and/or identification of an interaction partner of 
said hyperimmune serum reactive antigen or fragment thereof. 

33. A process for in vitro diagnosing a disease related to expression of the hyperimmune serum- 
reactive antigen or a fragment thereof according to any one of claims 11 to 14 comprising 
determining the presence of a nucleic acid sequence encoding said hyperimmune serum reactive 
antigen and fragment according to any one of claims 1 to 7 or the presence of the hyperimmune 
serum reactive antigen or fragment thereof according to any one of claims 11-14. 

34. A process for in vitro diagnosis of a bacterial infection, especially a S. pyogenes infection, 
comprising analysing for the presence of a nucleic acid sequence encoding said hyperimmune 
serum reactive antigen and fragment according to any one of claims 1 to 7 or the presence of the 
hyperimmune serum reactive antigen or fragment thereof according to any one of claims 11 to 14. 

35. Use of the hyperimmune serum reactive antigen or fragment thereof according to any one of 
claims 11 to 14 for the generation of a peptide binding to said hyperimmune serum reactive 
antigen or fragment thereof, wherein the peptide is selected from the group comprising anticalines. 

36. Use of the hyperimmune serum-reactive antigen or fragment thereof according to any one of 
claims 11 to 14 for the manufacture of a functional nucleic acid, wherein the functional nucleic acid 
is selected from the group comprising aptamers and spiegelmers. 

37. Use of a nucleic acid molecule according to any one of claims 11 to 14 for the manufacture of a 
functional ribonucleic acid, wherein the functional ribonucleic acid is selected from the group 
comprising ribozymes, antisense nucleic acids and siRNA. 

Sequence Listing 



SPy0012 

Seq ID 1 

ATGAGAAAATTATTAGCGGCTATGTTAATGAC I 11)11 rCTGACTCCTTTACCAGTGATTAGTACAGAAA/WW\CTTATATTTTCA 
AAAAATGGTGTTTATCAATTG/W\C/^AGATGTCGTTCAATC/Vi^CAC/VkTTCTAT^ 

GA/VACGTGTGCCTATAAAGACAGTGATTTAACTGTACCAGGAGGAAGATTAGGTGTAAATCAACCATTACTTAT^ 
TGGTTAACAAAGAATCTTTACCGGTTTTTGAGTTAGCTGATGGTAGCTATGTTGAGGCTAATCGACAATTGA^ 
GTACTTAATCAAGTAGATATAGATAGCTATTTTTGGAGACAAAAGAAACTTAGGCTTTATTCAGCCCCTTATGTTl^ 
AACAATTCCTTCTTCTTTTTTATTTGCTG/WVV^GTTCATGCCAGT 

TGATAAGGGCTGGGCATGACAAGAAGATCTAGTTCAATTTGATAACGGCATGTTAAAAGTCCAGGAAATGCTCTTACAAAAAT^ 

AATAACCCAAATTATTGAATTTTTGTAAAGCAACTCAACACACAAACAAGTGCTGGTATTAATGGTGATAAAAAAATGT^^ 

AAGTATCTCGAAGTTAGCACCAGTTTATATTGTTCAA/W\CAATTACAAAAAAAGAAATTAGCAGAG/^ 

CTA/^GATGTTAATCATTTTTATGGAGACTATGATCCATTGGG/V\GTGGTAAAATTAGTAAAATAGCTGATAATAM 

GTTGAAGACCTACTGAAAGCTGTAGCACAACAATCGGATAATGTAGCAACTAATATTTTAGGTTATTATCTATGTCATCAGTATGA 

TAAAGCTTTCCGCTCAGAGATAAAAGCTTTATCAGGTATCGATTGGGATATGGAGCAGCGCTTATT/V\CTTCTCGTTCAGCTGCA 

AATATGATGGAAGCTATTTATCATGAAAAAGGCCAAATTATTTCTTACCTTTCAAATACGGAATTT 

AATATTACTGTTCCAGTTGCACATAAAATTGGTGATGCTTATGATTATAAACATGACGTTGCTATTGTTTACGGTAA 

ATTTTGTCTATTTTTAGAAATAAATGAACGTATGAAGATATTAGGGCTATTGCAGATGAGGTTTATGGTATTTT 

SPy0019 
Seq ID 2 

ATGAAAAAAAGAATTTTATCAGCAGTTCTTGTAAGTGGTGTTACCCTCGGAGCAGCTACAACTGTAGGAGCGGAGGATTT^ 

ACTAAGATTGCTAAGCAAGArrCTATTATCTCAAATCTGACTACAGAGCAAAAAGCTGCACAGAATCAAGTTTGAGCGTTAGAGG 

GTGAAGTAAGTTCAGTACAATGTGAACAAGATAAACTGACCGCAAGAAATACAGAACTTGAGGCGCTTTCAAAGGGATT^ 

AG/W\TTAAGGCTGT/KACAAGTCAAATTGTTGCTGGT/KATGAAAAATTAA/W^TC/W^GCTCG^^ 

GTTCTGGTTATATT/^TGGACTTTTGAATTCTAAATG/^ATTTGTGATGTTGTAAACCGTT^ 

GTAACGCTAAATTGTTAGAACAACAAAAAGGTGATAAAGTTTCGGTTGAAG/WV^GC/\AGCTGCTAA 

GATTGCCGCTAATATGGCAATGGCTGAAGAAAACC/W^ATACATTAGGTACTC/^ACAAGCTAATTTGGTAGCTGC/^ACTGCA 

TTAGCTCTCCAATTAGCATGTGCTACTGAAGAT/W\GCT/\ATTTGGTAGCTCA/WKAGAAGCTGCAGAAAAAGC^^ 

GCTTAGCAGAAG/!^GAGGCTGCT/WKGTTAAGGCAC/s^G/VS^CAGGCTGCACAACAAGCAGCATCTGTTGAAGCAGC/\^^ 

GCTATTAGTCCAGCACCAC/s^GCTACTCCGGCAGGGC/W^GTAGTAATGCTATTG/su^CCAGCTGCACTCACGGCTCCGGGAGG 

TCCTTCTGCAGGACCACAAAGATCATATGATTCTTCTAATACTTATCCAGTTGGACAATGCACATGGGGAGCTAAATGTTTAGGT 

CCTTGGGCAGG/WSiTAATTGGGGAAATGGTGGTCAATGGGCTTATAGTGCTCAAGGAGCTGGTTATCGTAGTGGTTCAACGGC 

GATGGTAGGTGCGATTGCGGTTTGGAACGATGGTGGTTATGGACATGTCGCGGTTGTAGTTGAGGTTCAAAGTGCCTCAAGTAT 

TCGTGTGATGGAGTCTAACTACAGTGGTAGACAGTACATTGCTGACCACCGTGGGTGGTTTAATCCAACAGGTGTTAGATTTATT 

TATCCACACT/VK 



SPy0025 
Seq ID 3 

ATGTCCTCCTATTTTCCAGTCGCTGGGTTGTCGGACTTGGTATCTTATATGAATA/\ACGTATTTTTGTTGAG/WWkGGGTGAGTT 

TGGTATTAAATCGGCTAGTCTTGTGAAAGAGTTGACGGATAATCTACAACTGAGCTCTTTGAAGGCTTTGCGTATTGTG GAGGT C 

TATGATGTCTTCAATTTGGCTGAGGATTTGCTGGCGCGTGGTGAGAAGCATATTTTGTGTGAGGAGGTGACAGAGTGTCTTTTGA 

CGGAAACTGAAATGAGTGGGGAGCTTGAT/^AGGTTGCCTTGTTTGGGATTGAGGCGGTTCCTGGTC/V^TTTGACC^ 

CTAGTTCGGAAG/VAGGTTTGGTATTATTTGGAAGTGACAGTCAGGTTAAGGTCAATACAGCCCAGCTATACTTGGTCAAT^ 

TATTACAGAAGCAGAGCTTGAAGCGGTT/u^GAACTATCTTTTGAACGCTGTTGATTCGCGTTTCAAGGACATTACT^ 

G/W\GAGGAGGCTTTGTGTGTATGTGATAAGACGATCCCTAATCTTGATTTCTTTGAAACTTATGAAG 

TAAGGCAGAGCAGGGCTTGGCTATGGAGGTCGATGACCTTCTCTTCATCC/Wiu^TTATTTCAAATCA^^ 

TGAGACTGAGTTG/W^GTTTTGGATACTTACTGGTCAGACCAGTGCCGTGAGAGAACCTTTGAAACTGAATTGAAG/^ 

TTTTCAGGTTCTAAATTGGAAAAACAATTGCAGAG/VACTTATGACA/^ATATATCGCCATGGGTGATGAGGTTGGTCG^^ 

GCCACA/\ACACTTATGGATATGGCGAGTATTTTTGGTCGTTATGAGCGTGCG/s^GGGTCGTCTGGACGATATGG/KA 

TGAAATGAATGCCTGCTCAGTTGAGArrGAAGTAGATGTTGATGGTGTGAAAGAGCGTTGGCTCCTCATGTTTAAGAACGAGAC 

TGAC/KATCACCC/\ACAG/W\TTGAGGGATTGGGTGGAGCGGCGACTTGTATCGGTGGTGCTATTCGTGACGGATTGTGAGGAC 

GTTCATACGTTTATGAGGCTATGCGTATTTCAGGGGGAGGGGATATGACGACTCCGATTGCGGAAACACGTGCTGGTAAATTGC 

CACAAC/KAGTTATTTGTAAAACTGGGGCGCAGGGCTATTGTTCATATGGTAACCAAATTGGGCTTGCGACAACTTATGTGCGGG 

AGTACTTCCAGCCTGGCTTCGTAGCCAAACGTATGGAGCTTGGAGGTGTGGTTGGTGCTGCACCTAAGGAAAATGTGGTTCGT 

GAAAAAGCAGAAGGAGGGGATGTGGTCATCTTGCTCGGTGGTA/VAACAGGTCGTGATGGTGTCGGTGGTGCGACAGGTTCATC 

T/V\GGTTGAAACGGTTGAATCTGTGGAAACAGCTGGCGCAGAGGTACAAAAAGGGAATGCCATTGAAG/W\CGTAAGATTCAAC 

GTCTTTTGGGTGATGGC/^ATGTCACTCGTCTTATTAAGAAATCAAATGACTTCGGTGGAGGTGGTGTCTGTGTTGCCATCGGTG 

AATTGGGTGACGGTCTTGAAATCGATTTGGACAAGGTGCCTCTTAAATAGCAAGGTCTTAATGGTACTGAAATTGCAATCTCAGA 

ATCTCAAGAGCGTATGTCAGTCGTTGTTCGTCCAAATGATGTGGATGCCTTCATCGCAGCCTGCAACAAGGAAAATATCGATGC 

AGTCGTTGTTGCGACCGTTAGTGAAAAACCAAATCTTGTCATGACTTGGAATGGCGAAATCATCGTTGATTTGG/^CGCGG^ 

CTTGATACCAATGGTGTGCGTGTCGTTGTTGATGCTAAAGTCGTTGAGAAGGAGTTGACAGTTCCAGAAGCAGGCACAACATCA 

GCAGAGACACTTGAAGCAGATACGCTTAAGGTCTTGTCTGACCTCAACCACGCTAGTCAAAAAGGTCTTCAAACTATCTTTGACT 

CATCTGTTGGTCGTTCAACCGTTAACCACCGAATCGGTGGTCGTTACCAAATCACACCGACAGAAAGTTGTGTTCAAAAATTGGC 

AGTTCAACATGGTGTGAG/WKGAACTGCATCTGTTATGGCTCAAGGTTACAATCCTTATATTGCAGAGTGGTGACCTTATGAGGGT 

GCTGCCTATGCTGTCATTGAAGCGACAGCTGGGTTGGTAGC/iy^CGGGTGGTGACTGGTCTCGTGCACGTTTCTGTTACCAAGA 

GTACTTTGAGCGTATGGATAAACAGGGAGAGCGTTTTGGTCAGCGAGTATCAGCTCTTCTTGGTTCTATTGAGGCTCAGATTCA 

ACTTGGTTTGCCATCAATGGGCGGTAAGGACTCTATGTCTGGTACTTTCGAAGACTTGACAGTACCACCAAGCTTGGTAGGTTT 

CGGCGTGAGAAGAGGGGAGAGGGGCAAGGTTCTGTGTGGTGAGTTTAAAGCGGGTGGCGAAAACATTTAGTATATGGGAGGTG 

AAGCTATTTGAGAAGATATTGATTTTGACGTTATGAAGGAT/\ACTTTAGCCAGTTTGAAGCTATT^ 

GGTGCCTGAGCGGGTAAATACGGTGGTGTCGTAGAAAGTGT TGGT GTGATGAGTTTTGGTAACGGTATCGGTGCTTCTGTTGAA 

ATTGCAGAGCTTGAGAGGAGCTTGAGAGGTGAAGTGGGAGGTTTTGTGTTTAGATGAGCTGAGGAAATTGCTGAGGGGGTG/Wk 

ATGGGTGAAACTGAGGCAGACTTTAGAGTCAGTGTGAATGGA/^ATGAGGTTGCTGGCGCTAGCGTTCTAGGAGGCTTGGAAGGC 
AAATTGGAAGAGGTTTACCCAACAGAATTTGAGCAGACAGATGTTCTTGAAGAAGTTCCTGCTGTGGTATCAGATACTGTTATCA 

AGGCT/VKGGAAACAATTGAAAAAGCAGTGGTTTACATTGGAGTCTTCCCTGGTAGCAACTCAGAATACGATTCAGCT^ 

TGAACAGGTTGGAGCTAGTGTCAACTTGGTACCATTTGTAACCTTGAATGAGGTTGCTATTGCTGAGTCAGTTGACAGTATGGTT 

GCTAATATTGCT/KAGGCAAATATCATCTTCTTTGCTGGAGGTTTCTCAGCAGCGGATGAACCAGATGGGTCTGCT^ 

TC/^ATATOTTGCTTAACGAG/\AGGTCCGCGCAGCTATTGACAGCTrCATCGAAAAAGGTGGCCTTATCATCGGTATCTGT^^ 

GTTTGC/SAGCCCTTGTTAAATCAGGTCTTCTTCCATACGGAAAGTTCGAGGAAGCTGGTGAGACAAGTCGAACT^^ 

TGATGCT/SJ\TCAGCATGTTGCCAAGATGGTTGAGACTGGTATCGCA/Su^TACCAACTCACCTTGGTTGGCAGGAGT^GAGG 

CGATATTCATGCCATTCCAGTTTCACATGGTGAAGGTAAACTTGTTGTCAGCGCTTCTGAATTTGCAGAGCTAAGAGAC/^^ 

CA/\ATCTGGAGCCAATATGTGGACTTTGACGGAGAACCATCTATGGATTCTAAATACAATCCAAACGGCTCTGTCAATGCCATCG 

AAGGGATTACCAGC/VKGAATGGTCAAATCATCGGT/^AGATGGGACACTCAGAACGCTGGGAAGACGGACTCTTCCAAAATATG 

CCTGGTAAC/Wi^GACCAAATCCTCTTTGCAAGTGCTGTAAAATACTTTACAGGGAAGTAA 

SPy0031 
Seq ID 4 

ATGAAAAAATTTCATCGTTTTTTGGTCTCAGGAGTAATCGTTTTAGGTTTTAATGGTCTAGTACCTACTATGCCATCTAGACT^^ 

TCGCAACAGGAAAATCTTGTTCATGCAGCTGTTTTAGGCGATAACTATCCGAGTAAGTGGAAAAAAGGCAATGGAATCGATTCG 

TGG/^CATGTATATCCGGCAATGCACTTCTTTTGCAGCTTTTCGTTTAAGCTCTGCTAATGGTTTTCAGTTACCTAAAGGCT^^ 

GTAATGCCTGCACGTGGGGACATATCGCGAAAAATCAGGGTTATCCTGTGAATAAGACACCAAGCATAGGGGCTATCGCTTGG 

TTTGATAA/W^CGCTTATCAGTCAAATGCTGCTTACGGTCATGTAGCATGGGTAGCTGATATCGGTGGAGACACTGTCACTATCG 

AAGAGTATAATTAC/V^CGCTGGACAAGGCCCTGAAAGATACCAT/\AGCGTCAAATTCCAAAATCTCAGGTAAGTGGTTATATCC^ 

TTTTAAAGACTTATCATCTCAGACAAGTCATTCCTAGCCAAGACAACTAAAACACATTTCTCAAGCTTCATTTGACCCCTCTGGAA 

CTTATCACTTTACAACCAGATTACGAGTCAAAGGAGAAACCAGTATCGATAGCCCTGATGTTGCTTACTATGAAGCAGGTCAATC 

TGTTTATTACGATAAAGTCGTGACTGCTGGAGGTTATACATGGCTTAGCTACCTCAGTTTTTCTGGAAACGGACGCTATATTCCC 

ATTAAAGAGCCCGCACAGTCTGTGGTTCAAAATGAGAATACAAAACCTTCCATTAAGGTCGGTGATACTGTTACCTTCCCTGGC 

GTTTTTCGTGTAGATGAGCTTGTTAATAATTTGATCGTTAATAAAGAATTAGGCGGAGGAGACCCAACTCCAGTAAAGTGGATT^ 

ATCCCACACCATTAGATGAAACAGATAACCAAGGAAAAGTTTTAGGAGATCAAATTCTCCGTGTGGGTGAATATTTTATCGTCAC 

TGGTAGTTATAAAGTATT/WW\TTGATGAAGCAAGTAATGGTATTTATGTTCAAATGGGATCTGGTGGAACATGGGTA/^ 

AT/KAAGCTAACAAATTATAG 

SPy0103 
Seq ID 5 

ATGATTAATCAATGG/KACAACTTACGACACAAGAAGCTAAAAGGATTTACTCTTCTAGAAATGTTATTGG^^ 
TGTTTTGATGCTATTATTTeTGCCT/\ATTT/\AGGAAGGAA/W^GAGAGGGTTACAG 

GTGGAGAATCAAGCAGAACTATATG/\ATTATCTCAAGGCTC/WSu^CC/VkGTTTGAGCCAGTT/W\GGCAG 
GAG/WVCAAGAAAAAGGTTATCAAGACTATTATGAC/\AACATAAAAATG/WW\GCCCGTCTT^ 

SPy0112 
Seq ID 6 

ATG/WVKTTGGCATTATTGGTGTTGGC/WV\TGGCTAGCGCTATGATGAAAGGCCTT/W\CAAACACCCCATGAAC^^ 

CAGGATGATCTTTAGAACGGTCCAAGGAAATTGCGGAGCAGTTAGCACTGCCTTATGCTATGTCCGAGCAAGAGCTTATTGAGG 

AGGTTGATCTTGTTATTTTAGGCATCAAGCGTCAACTATTTGAAAGGGTAGTC/W^CCGCTTGA 

ATGGCAGCAGGCATTTCCGTTCAACGAGTAGCAACATTCGTAGGACAAGAGCTTCCGCTGCTACGTATCATGCC/W\AGATGAAT 

GCACAAATTCTCCAAAGCAGTAGCGCTTTAACGGGAAATGCTTTGGTGTCCGAGG/VATTACAAGCAGGTGTTCGAGACTTAACA 

GATAGCTTTGGTAGGAGATTTGATATTAGTGAAAAGGATTTTGACACGTTTAGGGGTTTAGCAGGGTCAAGTCCTGCGTATATT^ 

ATCTCTTTATTGAGGCTTTGGGTAAGGCTGGCGTCAAGAATGGCATACGTAAAGGAAAGGCGCTGGAGATTGTTACTC/\AAGAG 

TATTGGCTAGCGGGAGC/KATGTGAAGAGGAGTTCTCAAAGTGGGGAGGATTTGATTGACGCTATTTGTAGGGCCGGTGGGAGAA 

CTATTGGTGGTCTGATGGAGTTAG/VyiCGCCTTGGCCTCACAGCTACTGTGAGGTCTGCCATTGAC/V/\AACCATCGATAAAGCTA 

/VAAGCTTGTAA 

SPy0115 
Seq ID 7 

ATGACAGACTTATTGTCAAAAATCAAAGAAGTTACCGAACTGGATGGCATTGCGGGCTATGAACATAGCGTTCGTGACTACCTA 

CGCACCAAAATAACCCCGGTGGTTGACCGTGTTGAAACAGACGGGCTTGGTGGCATTTTTGGTATCAGAGATAGTAAAG 

AAAGCCCCCCGTATTTTAGTAGCTGCGCACATGGACGAAGTCGGTTTTATGGTCAGTGATATC/W\GTTGACGGAACGCTACGC 

GTGGTTGGTATCGGTGGTTGGAAGCCACTTGTTGTCAGTTCAC/VkCGGTTTACCCTTTACACACGCACTGGCCAAGTTATTGCG 

CTTATTTCAGGATCGGTACCTCCGCATTTTTTACGTGGGGCAAATGGCTGTGGTAGTCTACCACATATCGAAGATATTGTGTTTG 

ATGGTGGCTTTACGGATAAGGCAG/SiAGCTG/WKGATTTGGTATTACACGGGGTGATATTATTATCCCTC/V^TCTGAAACGATCCT 

AACAGCC/\ATCAAAAAAATATTATTTGAAAAGCTTGGGAG/\ATCGGTATGGCGTTCTCATGATAA^^ 

AAAGGACAAGACCTT/^AC/\AGACGCT/^TTGGAGGTGCTAACGTTCAAGAAG/\AGTTGGTCTGCGCGGAGCCCACGTCTCAAG 
CACC/VAGTTCGACCCTG/iu\CTCTTTTTCGCAGTAGATTGTTCGCCTGCTGGTGATATTT 

GGTACCTTGTTGCGTTTCTACGACCCAGGCCATGTCATGCTC/W^GATATGCGCGACTTCTTACTGACTACTGCTGAGGAAGCT 
GGTGTCAATTTCGAATACTATTGTGGC/V^GGGAGGCACAGATGGAGGTGCTGCACACCTTCAAAATGGTGGTGTCCCATCAACA 
ACCATCGGAGTCTGTGGAGGCTACATTCACTCTCATCA/VAGCCTCTACGCTATGGATGATTTCGTAGAAGCCCAAGGGTTCTTAC 
AAGOCATTATCAAA/W*iCTGGATGGCTC/VSiCGGTTGACTTGATTA/\ATGTTACTAA 

SPy0166 
Seq ID 8 

ATGG/W\GATATTTCTGATCCAGAAGTTATTTTAGAGTATGGGGTTTACCGTGCTTTCATAAAAGGCTATACCCAAT^ 

CATCGAAGAAGGATTATTAGAAATGTGAAATAGCGGTCAAGCATTAGACATTTACCAAGCAGTTCAAACCCTAAACGCTGAA^ 

ATGTTATTAAATTATTACGAAAGCTTGCCATTTTATTTAAACCGTCAAAGCATACTAGGTAATATGACCAA^ 

GCATATTAGAGAGGGTATGGCAGATTAGAAATTAGGAGAATTTGCTGACTATGAAGATACTATGGTTGATATGGTCGAAAGA^ 

ATAAAAACATTTTAG 

SPy0167 

Seq ID 9

ATGTCTAAT/WWW\CATTTAAAAAATACAGTCGCGTCGGTGGGGTAGTGAGGGGAGCTGTTATCATTGGT/^ 
CTAATGCTGAATCGAACAAACAAAACACTGCTAGTACAGAAACCACAACGACAAATGAGCAACCAAAGCCAGAAAGTAGTGAGC 

TAACTACTGAAAAAGCAGGTCAGAAAACGGATGATATGCTTAACTCTAACGATATGATTAAGCTTGCTCCCAAAGAAATGC 

AGAATCTGCAGAAAAAGAAGAAAAAAAGTCAGAAGACAAAAAAAAGAGCGAAGAAGATCACACTGAAGAAATCAATGAC 

TTATTGACTAAATTATAATGAGCTTGAAGTACTTGGTAAAAATGGTG/WKCCATTG/WVVTT^ 

AAGCTGATAAATTTATTGTCATTGAAAGAAAGAAAAAAAATATC/VKCACTACACCAGTCGAT^ 

GATAGGACCTATCGAGCAGCCCTTCAGCTGGGTAATAAAGGTTTTAGCGAAAACAAACCAGACGCGGTAGTCACCAAGC^ 

CCCACAAA/VAATCCATATTGATTTACCAGGTATGGGAGACAAAGCAACGGTTGAGGTCAATGACCCTACCTATGCCA^^ 

ACAGCTATTGAT/V\TCTTGTT/V^CCAATGGCATGATAATTATTCTGGTGGTAATACGCTTCCTGCCAG/s^GAC/^^ 

AATGGTATATTCT/^GTCACAGATTG/\AGCAGCTCTAAATGTTAATAGCAAAATCTTAGATGGTAGTTTAGGCATTG 

CGATTTCAAAAGGTGAAAAG/VVGGTGATGATTGCAGCATACAAGCAAATTTTTTACACCGTATCAGC/^ 

TGCGGATGTGTTTGATAAATCGGTGACCTTTAAAGAGTTGCAACGAAAAGGTGTCAGCAATGAAGCTCCGCCAGTCTTTGTGAG 

TAACGTAGCCTATGGTCGAACTGTTTTTGTCAAACTAGAAAC/y\GTTCTAA/y\GTAATGATGTTGA^ 

CTAA.^<\GGAACAGATGTTAAAACTAATGGAAAATATTCTGATATCTTAGAAAATAGCTGATTTACAGCTGTCGTT^^ 

TGCTGCAGAGCACAATAAGGTAGTCACAAAAGACTTTGATGTTATTAGAAACGTTATGAAAGACAATGCTACCTTGAGTAGAA/^ 

AACGCAGCTTATCCTATTTCATACACCAGTGTTTTCCTTAAAAATAATAAAATTGCGGGTGTCAATAACAGAACTG^ 

AACAACATCTACCGAGTACACTAGTGGAAAAATTAACGTGTCTCATCAAGGCGCGTATGTTGCTCAATATGAAATCCTTTGGG 

GAAATCAATTATGATGACAAAGGAAAAGAAGTGATTACAAAACGACGTTGGGAC/VKCAACTGGTATAGTAAGACATCAC 

GCACAGTTATCGCACTAGGAGCTAATTCACGAAATATCCGTATCATGGCTAGAGAGTGCACTGGCTTAGCTTGGGAATGGTGGC 

GAAAAGTGATCGACGA/V^GAGATGTGAAACTGTCTAAAGAAATCAATGTCAATATCTCAGGATCAACCTTGAGCCCATATGGTTC 

GATTACTTATAAGTAG 

SPy0168 
Seq ID 10 

ATGAAACAACAATCTTACCAGCCTGTACGCTTCGTCTACCTGTTGGTGGCTCTATTTGCTGCTCTGTTGCTTATAGCAAGACCTG 

TTATGGCAGATGAGGGAACAAATAGTGCTGATGCGGCGTATrATAAAGGGCAAAGTGCTGGAAAAAAAGCAGGGAAAAAAGCT 

GGAAAAGAAGCTACTTGGACTGATTTGACCCCAACTGTCCCAACTAATCCAGAAACACCTAGTGACATCGGAGAGACTACTAAT 

/W\CAGGTCTATAAAGAAGGGTATAAAGATGGGTACAAAGAGGGTTATAATGAAGGCTGGAAATCTCAGTATCCCGTTTTGACT 

CCGGTGAAGGTTATATGGGATTTGATCTCTTATTGGCTACAGCGATTATTCCCCAATAATCAGTCAAGTACCGCAGCACAAAG 

TGTCATAA 

SPy0171 
Seq ID 11 

GTGA/W^ACA/VATTATTTTTAGTTGCCCTTGCGACCGT/s^CTGTCCTAGGGCCGTCTTTAGC^ 
CATGCTAGTGATGTAACATTAACTGAGACATGTGATAAAAACGG/\ACAGTATGTTTTGGCTACG/Ws^CGTAG^^^ 
GTAAATTAACAGCTGACGGAAAGGGAACCATTTGTGTGGGTTACGAAAATAGAGACATAAAAGAGAGTGAAACTTCTAGCACCA 
AAAATGATTGTTCT/\ATTGGTTTTGGTGCTTTTTA/W\TTATCTTTGGACTAC^^ 

SPy0183 
Seq ID 12 

ATGG/W\CAATTTTAGAAGTC/W\CATCTCAGTAAAATTTTTGGCAAAAAACAAAAAGCAGCTCTTGAGATGGTA^ 

AAAATAAGAGTGAGATTTTTAAGAAAACAGGCGCTACTGTAGGTGTCTATGACGCTAGTTTTGAGGTCAAAAAAGGTG/^ 

TGTTATTATGGGGCTATCAGGAAGTGGGAAATCAACCCTTGTCCGCATGCTAAATCGTTTGATTGAACCTTCAGCAGGATCTATC 

TTGCTGGAAGGTAAAGACATCTCAACGATGTCAGCAGATCAGCTGCGTGAGGTGCGCCGCCATGACATTAACATGGTTTTCCAA 

AGCTTTGCCGTCTTTCCTCAT/W\ACCATTTTGGAAAATACCGAATTTGGTTTGGAATTACGTGGCGTT 

AGGGATTGGCAGAAAAAGCGCTTGATAATTCAGGCCTATTAGATTTT/W\GACCAGTACCCAAACCAACTATCTGGTGGGAT 

AGCAGCGTGTCGGCCTAGCCCGTGCGCTAGGTAATAGCCCTAAAATTCTCTTAATGGACGAGGCATTCTCAGCGCTTGATCCTT 

TGATTGGTCGTGAGATGCAAGATG/^ATTACTTGATTTGCAAGACAGCATGAAACAAAGCATCATCTTTATCAGTCATGAGTTGAA 

TGAAGGCTTGCGGATTGGTGATCGGATTGCTTTGATGAAAGACGGACAAATTATGC/WVTTGGTACTGGTGAGGAAATCTTGAC 

TAACCCAGCCAATGACTTTGTGCGTGAATTCGTTGAGGATGTGGACCGTTCTAAAGTCTTGACAGCACAAAATATCATGATCAAA 

CCGTTAACAACTACTGTTGAATTAGATGGACCTC/\AGTTGCCTTGAACCGTATGCACAACGAAGAGGTGTCTATGTTGATGGCG 

ACGAATGGCCGCCGCCAATTAGTCGGTAGTTTGACGGCCGATGCCGCTATAGAGGCGCGCAAAAAAGGGTTACCGCTATCAGA 

AGTGATTGATCGGGATGTGAGAACTGTCTC/WSiAGATACTATTATTACAGATATTTTGCCTGTTATCTATGATTCATCTGCTGGGA 

TTGCAGTGACAGATGATAAT/\ATCGTCTGTTAGGTGTCATTATTCGAGGACGAGTGATTG/\AGCCTTGGCTAATATCTCAGACGA 

AGAGCTT/\ACT/V\ 

SPy0230 
Seq ID 13 

ATGAAAACAGCACG I lit! I GTGGTTTTATTTTAAACGCTATCGTTTCTCATTTAGTGTCATTGGTGTTGCCGTTATCTTAGCAACT 

TATTTAC/O^GTAAAAGCTCCTGTCTTCTTAGGAGAGTCCTTGACTGAGTTGGGAAAAATCGGTCAGGCTTA^^ 

TGAGTGGCCAGACACATTTTAGCCCTGATTTATCAGCTTTTAATGCCGTGATGTTT/\AGCT^ 

TTAGCTAATCT/VATATATAGTTTCTTACTTACACGTGTTGTCTCACATTCGACTAACCGCATGCGCAAGGGCTTAT^^ 

AGAACGTTTAACCGTCGCCTTTTTTGACCGCCAT/W\GATGGGGAGATTCTTTCTCGT^ 

AACTCGCTGAACC/s^TCCTTGATTCAAGTGGTGACTAATATTGCCCTTTACATCGGCCTGGTCTGGATGATGTTTAGGCAAGAT^ 
GCCGTTTAGCTTTGTTAACCATCGCATCAACCCCAGTTGCTCTCATTTTTTTAGTG 
AATATCC/VkCAGCAAGAAGTCAGTGCTTTAAATGCTTTTATGGATG/W\CCATTTCAGGACAA/^ 
TCCAAGAAGATACGATGACAGCCTTTTTAAAGCATAATGAAAGGGTTCGAC/iiAGCCACCTTCAAAC^^ 

AATTATTTCCAGTCATGAATGGAATGAGCCTTATTAACACGGCTATCGTGAI I I T I GTCGGTTCAACAATTGTCCTCAGTGACAAA 
TCTATGCCAGCAGCGGCAGCGCTTGGTTTAGTGGTTACTTTTGTACAATATTCCCAGC/KATATTACCAACCCATGATGCAA^ 

CGTCTAGTTGGGGAGAATTGCAGCTGGCCTTTACCGGTGCTCACCGTATTCAAGAAATGTTTGATGAAACCGAAGAAGTTCGTC 

CACAAAATGCACCAGCGTTCACCAGCTTAAAAGAAGCAGTGGCGATTAACCACGTCGATTTTGGGTATCTTGCTGGGCAAAAAG 

TTTTATCAGATGTGTCAATCGTTGCACCCAAGGGCAAAATGATTGCCGTGGTTGGACCGACAGGTTCTGGAAAGACCACTATTA 

TGAACTTGATTAACCGTTTCTACGATGTGGATGGAGGTTCGATTACCTTTGATGGCCGTGATATTCGTGACTAGGATTTGGATAG 

TCTTCGTC/WV\GGTAGGGATTGTGTTGCAAGAGTCAGTTCr I' I I I TCAGGAACCATTACGGATAATATTCGTTTTGGTGATCAG 

ACCATTAGTCAAGACATGGTTGAAACTGCTGCGCGTGCGACCCATATTCATGACTTTATCATGTCCTTACC/W\AGGGTACAATA 

GCTATGTCTCAGATGATGACAATGTCTTTTCAACAGGTGAAAAGGAGTTGATTTCTATTGGTAGGACGCTACTGAGTGAGGCTGA 

AGTGTTGATTTTGGATGAGGCCACTTCAAATGTTGATACGGTTACCGAAAGTAAAATTCAACGGGCCATGGAAGCTATC 
AGGTCGAACTAGCTTTGTCATTGCTCACCGCCTCAAAACCATTTTAAATGCCGATCACATTATTGTGTTGAAAGATGGC/\AGGTC 
ATTGAGCAAGGAAATCATGATGAGCTATTGCATCAAAAAGGGTTTTATGCCGAATTGTATCAGAATCAATT^ 

SPy0269 
Seq ID 14 

ATGGACTTAGAACAAACGAAGCCA/^ACGAAGTTAAGCAGAAAATTGCTTT/VACCTCAACAATrGC 

GCGTATCTCACC/VAGTC/W5lGCAGATGATAGAGCCTCAGGAGA/VACG/\AGGCGAGTAATAGTCACGACGATAGTTTACC/WKAC 
CAGAAACAATTCAAGAGGCAAAGGC/\ACTATTGATGCAGTTGAAA/W\CTCTCAGTCAACAAAAAGCAGAAGT^^ 
CTACCGCTCTGACAAAAACTAGTGCTG/WSiTC/VKCCACTTAAAAGAGCAGC/\AGATAATGAAC/Su^^^ 
AG/W^TTTACACTAATACTCTTGCAAGTAGTGAGGAGACGCTATTAGCCCAAGGAGCCGAACATC/W^GAGAGT^ 

TGAAACAGAGGTTCATAATGCTCAAGCAGATCAACATTCAA^GAGACTGCATTGTCAGAACAAAAAGCTAGGATT^ 

ACTACTCGAGGTGAAGATTTAGTGGAACAAGTGAAAACGTCTGAACAAAATATTGGTAAGCTCAATGGTATGATTAGCAATC 

ATGCTATCACTAAAGCAGCTC/W\CGGCTAATGATAATACAAAAGCATTAAGCTCAGAATTGGAGA/^^.GGCTAAAGCTGAC 

AAATCAAAAAGCTAAAGTTAAAAAGCAATTGACTGAAGAGTTGGCAGCTCAGAAAGCTGCTCTAGCAGAAAAA 

TAGTCGTCTTAAATCCTCAGCTCCGTCTAGTCAAGATAGCATTGTGGGTAAT/KATACCATGAAAGCACCGCAAGGGTATCGTCTT 

GAAGAACTTAAAAAATTAGAAGCTAGTGGTTATATTGGATCAGCTAGTTACA^TAATTATTACAAAGAGCATGCAGATCAM 

TGCG/\AAGGTAGTCCAGGTAATCAATTAAATCAATACCAAGATATTCGAGCAGATCGTAATCGCTTTGTTGATCCC^ 

ACACCAGAAGTGCAAAATGAGCTAGCGCAGTTTGCAGCTCACATGATTAATAGTGTAAGAAGACAATTAGGTCTAGCACCAGTT 

ACTGTTACAGGAGGATCAC/VAGAATTTGCAAGATTACTTAGTACCAGCTATAAGAAAACTCATGGTAATACAAGACCATGAT^ 

TCTACGGACAGCCAGGGGTATCAGGGCATTATGGTGTTGGGCCTCATGATAAAACTATTATTGAAGACTCTGCCGGAGCGTCA 

GGGCTGATTCGAAATGATGATAACATGTACGAGAATATCGGTGCTTTTAACGATGTGCATACTGTGAATGGTATTAAACGTGGTA 

TTTATGAGAGTATCAAGTATATGCTCTTTACAGATCATTTACACGGAAATACATACGGCGATGCTATTAACTTTTTACGTGTAGAT 

AAAGATAACCCTAATGCGCCTGTTTACCTTGGATTTTCAACCAGCAATGTAGGATCTTTGAATGAAGACTTTGTAATGT^ 

GTCTAACATTGCTAACCATCAAGGCTTTAATAAGACCCCTATAAAAGCCGTTGG/ivAGTACAAAAGATTATGCCCAA^ 

AGTGTATCTGATACTATTGCAGCGATCAAAGGAAAAGTAAGCTCATTAGAAAATGGTTTGTCGGCTATTGATCAAGAAGGTGATA 

TTATGGCAGCCCAAGCTAAAGTAAGTCAACTTCAAGGTAAATTAGCAAGCAGACTTAAGCAGTCAGACAGCTTAAATCTCCAAGT 

GAGACAATTAAATGATACTAAAGGTTCTTTGAGAAGAG/KATTACTAGGAGCTAAAGC/WVVCAAGCACAACTG 

GATC/\ATCATTAGCTAAGCTAGCATCGTTGAAAGGGGCAGTGCACGAGACAG/VVGCCTTAGCAGAGCAAGGGGCAGGGAGAGT 

GACAGCACTGGTGGCTAAAAAAGCTCATTTGCAATATCTAAGGGAGTTT/W\TTG/V\TCCTAACCGCCT^ 

CGCATTGATAATACTAAGCAAGATTTGGCTAAAACTACCTCATCTTTGTTAAATGCACAAGAAGCTTTAGCAGC 

ACAAAGCAGTCTAGAAGCTACTATrGCTACCACAGAACACCAGTTGACTTTGCTT/W^CCTTAGCTAACGAAAAGG^ 

CACTTAGACG/>iAGATATAGCTACTGTGCGTGATTTGCAAGTAGCTCCACCTCTTACGGGCGT/WWKCCGCTATCATATAGTAAGA 

TAGATACTACTCCGCTTGTTCAAGAAATGGTTAAAGAAACGAAACAACTATTAGAAGCTTCAGC/iAGAT^^ 

/\AGTCTTGTAGCAGAAGCGCTTGTTGGCCAAACCTCTG/WS^TGGTAGC/\AGT/\ATGCCATTGTGTCTAAAATCACATC^^ 

ACTCAGCCCTGATCTAAGACATCTTATGGCTCAGGATCTTCTACAACGAGCAATCTCATTTCTGATGTTGATGAAAGT^^ 

GAGCTCTTA/iw^GCAGGAGTCGTCATGTTGGCAGCTGTCGGCCTCACAGGATTTAGGTTCCGTAAGGAATCTAAGTGA 

SPy0287 
Seq ID 15 

ATGACAAAAGAAAAACTAGTGGCTTTTTCGCAAGCCCACGCTGAGCCTGCTTGGCTGCAAGAACGGCGTTTAGCGGCATTA 

AGCCATTCCAAATTTGGAATTACCAACCATCGAAAGGGTTAAATTTCACCGTTGGAATCTAGGAGATGGTACCTTAACAGA^ 

GAAAGTGTAGCTAGTGTTCCAGATTTTATAGCTATTGGAGATAACCCAAAGCTTGTTCAGGTAGGCACGCAAACAGTCTTAGAAC 

AGTTACCAATGGCGTTAATTGACAAGGGAGTTGTTTTCAGTGATTTTTATACGGCGCTTGAGGAAATCCCAGAAGTAATTGA^ 

TCATTTTGGTCAGGGATTAGCTTTTGATGAAGACAAACTAGCTGCCTACCACACTGCTTATTTTAATAGCGCAGCCGTGCTCTAC 

GTTGCTGATCACTTGGAAATCACAACTCCTATTG/^GCTATTTTCTTACAAGATAGTGACAGTGAGGTTCCTTTT^ 

TCTAGTGATTGCAGGAAAAGAAAGTAAGTTCACCTATTTAGAGCGTTTTG/JvATCTATTGGCAATGCCACTCAAA^ 

AATATCAGTGTAGAAGTGATTGCTCAAGCAGGCAGCCAGATTAAATTCTCGGCTATCGACCGCTTAGGTCCTTCAGTGAC/W\CC 

TATATTAGCCGTCGAGGACGTTTAGAGAAGGATGGCAACATTGATTGGGCCTTAGCTGTGATGAATGAAGGCAATGTGATTGCT 

GATTTTGACAGTGATTTGATTGGTCAGGGCTCACAAGCTGATTTGAAAGTTGTTGCAGCCTCAAGTGGTGGTCAGGTAGAAGGT 

ATTGACACGCGCGTGACCAACTATGGTCAACGTACGGTCGGTCATATTTTACAGCATGGTGTGATTTTGGAACGTGGCACCTTA 

ACGTTTAACGGGATTGGTCATATTCTAAAAGACGCTAAGGGAGCTGATGCTCAAC/V\G/W\GCCGTGTTTTGATGCTTTCTGAC 

CAAGCAAGAGCCGATGCCAATCGAATCCTCTTAATTGATG/WJvATGAAGTAACAGCAGGTCATGGAGCTTCTATCGGTCAGGTT 

GACCCTGAAGATATGTATTACTTGATGAGTCGAGGACTGGATCAAGAAACAGCAGAACGATTGGTTATTAGAGGATTCGTAGGA 

GCGGTTATCGCTGAAATTCCTATTCCATCAGTCCGCCAAGAGATTATTAAGGTTTTAGATGAGAAATTGCTT/W\TCGTTA^ 

SPy0292 
Seq ID 16 

ATGATCAAACGATTAATTTCCCTAGTGGTCATCGCCTTATTTTTTGCAGCAAGCACTGTTAGCGGTGAAGAGTATT^ 

CT/V\GCATGCGATTGCCGTTGACCTTGAAAGTGGC/W\GTTTTATACGAAAAAGATGCTA>^ 

TCAGTAAGCTCTTGAC/KACCTATCTGGTTTACAAAGAAGTTTCT/VAGGGCAAGCTAAATTGGGATAGTCCTGT^ 

CTAGCCTTATGAACTCACTACAAACTATACTATTAGT/KACGTTCCTCTTGATAAGAG/WVATATACCGTTA^ 

CGTTAGTTGTT/V^TAACGCG/3^TAGCCCCGCTATTGCTTTAGCTGAAAAAATAGGCGGAACCG/\ACCCA/^ 

G/WW\ACAATTAAGAC/VVTGGGGCATTTCCGATGC/VAAGGTCGTCAATTCAACTGGCTTAACT/\ACCAT^ 

ACTTATCCTAATACAGAACCAGATGATGAAAATTGTTTTTGCGCCACTGATTTAGCTATTAT^ 

TCGAGAAGTACTGAAATTATCTAGCAAATCCTCCACTATTTTTGCTGGACAAACCATTTACAG^^^ 

TGCCTTGTTATCGAGAAGGCGTGGATGGTCI I I I I GTTGGTTATTCTAAAAAAGCCGGTGCTTCTTTTGTAGCTACTAGTGTCGA 

A/y\TCAAATGAGGGTTATTACAGTAGTTTTAAATGCTGATCAAAGCCACGAGGATGATTTAGCTATATTTA/^ 

TGTTGCAGTACGTTTTAATTAATTTTCAAAAAGTCCAGTTAATTGAAAATAATAAACCAGTAA/\^ 

CTGAAAAAACTGTCAAACTTGTAGCCCAAAATAGTTTATTTTTTATCAAACCAATACATAC/W\GA 

CTAAGAAATCATCCACAATGATCGGACCTCTATCAAAGGGACAAGTCTTAGGTAGAGCAACCCTTCAAGATAAACATCTTATTGG 
AGAAGGTTATGTGGATACTCCTCGTTCTATCAATGTTATCCTTCAAAAAAACATTTCTAAAAGTTTGI I i I I AAAGGTGTGGTGGAA 
CCGTTTTGTGAGGTATGTCAATACCTCTTTATAG 

SPy0295 
Seq ID 17 

ATGGAATCGATTGATAAATGTAAATTTCGATTTGTTGAGCGGGATAGTG/^AGCCTCCG/KAGTGATTGATACGCGTGGT^^ 
ACTGGAAATCAGTGTTTCGTCAGTTTTTTTCTAAAAAATCTACAGTCTTTATGCTCGTAATT^ 
TTTATTTATCC/\ATGTTTGCCAACTACGACTTTAATGACGTTAGTAATATCAATGAGTTTT 

GAGTAGTGGTTTGGAACCGAGAAAAATGGGCAATGTCTGTTTGATGGTGTTTGGTATGGGGCACGTAATTCTATTTTAATCTCAG 

TTATAGCGACACTAATTAATATCACCATTGGGGTAGTGTTAGGAGCCATATGGGGAGTTTCTAAAGCATTTGATA^ 

TG/WKTTTATAACArrATCTCAAATATCCCTTCTATGCTTATTATCATTGTTTTGACCTATTCAT^^ 

GATTCTAGCTTTCTGTATCACTGGATGGATTGGTGTCGCCTAGTCCATCCGTGTTCAAATCTTGCGTTACCGTGATTTAG^^ 

AAGCTTGCTAGTGAAACTTTGGGAAGACCAATGTACAAGATTGCTGTTAAGAACCTCGTGCCTCAATTGGTT^ 

CTATGTTGTCACAAATGCTACCAGTTTATGTATCTTCTGAGGCCTTCTTATCCTTCTTTGGGATTGGTTTACC 

AGTTTAGGACGTTTTATTGCTAATTATTGAAGCAACTTAACAAGAAATGCCTACCTCTTTTGGATTCCCTT^^ 

GTATGGTTACCACTATAGATTGTGGGACAAAACTTGGCTGATGGCAGTGACCCACGTTCACATAGATAG 

SPy0348 

TTGGCTITGACGGATTTTAAGGATAAAGAC 

GC/W\TG/W\TCAGAAAAGAAAAAGAGGAAGAACTTTTTC/io^V^ 

CTATATGCTG/\ATATAAAAGACAAGATGCTTTTCAAAAAGAGTCTATAGCACAT/\ACAATAAGACAGGTA^ 
AAAAGGTGCGGTAATGACTTCAGAAGCGCTTAAACCGACTTTACTTTCTGAAAAAG/V^AACT^^^ 

AGAGTCGTGCAGGCAAATGAGCTTC/KAGAGACTGCCTCTAA^GAATCTCAAGTACCGTTAACTATTGAGAAAGGTCATTCAG 

AGACGAAAATTAAGCAAACGCCAACAGACTGAGCGAGCTGCTAAAAAGATTTCAACCGTTTTGATT^ 

TTTTGGCTGTTACTCTAGCAGGAGCAGGCTATGTTTATAGTGCTTTAAATCCTGTTGATAAAAATAGTGATGCCTT^ 

GAGATTCCATCTGGGTCAGGCAATAAATTGATTGGTCAAATTCTTCAAAAAAAAGGTTT/y\TCAAGAATAGC 

TTATACAAAATTTAAAAACTTTACAAATTTTCAGAGGGGGTATTAT^ 

CTTTACAAGAAGGTGGTACAGCAGAACCTACCAAGCCATCTGTTGGGAAGATCTTGATTCCAGAAGGATACACGATTAAAC/^ 

TAGCT/W^GCTGTTGAGCATAATAGCAAGGGAAAGACCAAAAAAGCTAAAACACCTTTTAACG 

CACGGATGAGGCTTTTATTCAAGATATGGTAAAAAGATATCCAAAATTATTAGGAACTATCGC/^GTAA^ 

GTTTGGAAGGTTACCTTTTCGCAGCAACCTATAACTATTACAAAGAAACTACCATGAGAGAACTTGTAGAGGACATGCTGGGA^ 

CTATGGATGCTAGTTTGGTAGCGTATTATGATAAAATTGGTGCTAGTGGTAAGAGAGTGAACGAGGTATTGAGGTTGGCGTCTT^ 

GGTTGAAAAAGAAGGTTGAACAGACGATGACAGACGTGAAATTGCAAGTGTCTTTTATAACCGCCTT^^ 

AGAATCTAATATAGGTATTTTGTATGCGATGGGGAAACTTGGTGAGAAAAGAACGTTGGGTGAGGATGCTACTATT^^^ 

ATTAArrCTCGTTATAATATTTATACCAATACAGGTCTGATGGGTGGTGCAGTTGGTAGCTCGGGGGTTTCTGCAATTGAAGCA^ 

CCCTAAATCCAGCCTCAACGGATTATTTATACTITGTGGCGAATGTCCATACTGGTGAAGTTTACTATGGAAAAA^^ 

CACTCTGCAAATGTTGAAAAATATGTGAATAGTCAAATTCAGTAA 

SPy0416 

GTGGAGAAAAAGCAACGTT^ 

CAACAACAGTAGGAGGAGATGAGGTAAGCACAATGAGCGAACGAACAATCACGAATCACGCTCAACAACA^^ 

CCAATAGAGAGTTGAGCTCAGGTGAATCAAAATCTCAAGACACATCAGAAATCACTCTCAAGACAAATCGTGA 

ACAAGATGTAGTGTGTGAGGGAACCACAACTGAGCTAGGTGACACAGATGCAGCATCAATGGCTAATACAGGTTCTGA^^ 

TCAAAAAAGCGCTTGTTTAGGGGCAGTCAATACAGATGTTCACGATTGGGTAAAAACCAAAGGAGCT^^^ 

AGGACAAGGGAAGGTTGTCGGAGTTATTGAGACAGGGATCGATCCGGCCCATCAAAGCATGCGCATC^^ 

GTAAAGTAAAATCAAAAGAAGACATGCTAGCACGCCAAAAAGCCGCCGGTATTAATTATGGGAGTTGGATAAATGAT 

TTTTGGAGATAATTATGTGGAAAATAGCGATAATATCAAAGAAAATCAATTCGAGGATTTTGATGAGGAGTG^ 

TTGATGCAGAGGCAGAGCCAAAAGCCATCAAAAAACACAAGATCTATCGTCCGCAATCAACCCAGGCACCGAAAGAAAGTGTT^ 

TCAAAACAGAAGAAACAGATGGTTCACATGATATTGACTGGACACAAACAGACGATGAGACCAAATACGAGTCAC^^^^^ 

ATGTGACAGGTATTGTAGCCGGTAATAGCAAAGAAGCCGCTGCTACTGGAGAAGGCTTTTTAGGAATTGCAC^^ 

GTCATGTTCATGCGTGTTTTTGCCAACGACATCATGGGAT^^ 

TAGGAGCAGATGTGATCAACCTGAGTCTTGGAACCGCTAATGGGGCACAGCTTAGTGGCAGCAAGCCTCTAATG^^^^ 

GAAAAAGCTAAAAAAGCCGGTGTATCAGTTGTTGTAGCAGCAGGAAATGAGCGCGTCTATGGATCTGACCAT 

GCG^^AAATCCAGACTATGGTTTGGTCGGTTGTCCCTCAACAGGT 

GTGATTCAACGTCTAATGACGGTCAAAGAATTAGAAAACCGTGCCGATTTAAACCATGGTAAAGCCATGTATTCAGA^^^^ 

ACTTTAAAGACATAAAAGATAGCCTAGGTTATGATAAATCGCATCAATTTGCTTATGTCAAAGAGTCAACTG^G^^ 

GCACAAGACGTTAAAGGTAAAATTGCTTTAATTGAACGTGATCGCAATAAAACCTATGACGAAATGATTGCTY^^ 

ATGGAGCTCTGGGAGTACTTATTTTTAATAACAAGGCTGGTCAATCAAACCGCTGAATGCGTCTAACAGC^^ 

TACCATCTGCTTTCATATCGCAGGAATTTGGTAAGGCGATGTCCCAATTAAATGGCAATGGTACAGGAAGTTTA^^ 

TGTGGTCTCAAAAGGACCGAGTCAAAAAGGCAATGAAATGAATCATTTTTCAAATTGGGGCCTAACTTCTGATGGGTATl^^ 

CCTGACATTACTGCACCAGGTGGCGATATCTATTCTACCTATAAGGATAACCACTATGGTAGCCAAACAGGAACAAGTATGGCC 

TCTCGTCAGATTGCTGGCGCCAGCCTTTTGGTCAAACAATACCTAGAAAAGACTCAGCCAAAGTTGCCAAAAGAA^ 

GATATGGTTAAGAAGCTATTGATGAGCAATGCTCAAATTCATGTTAATCCAGAGACAAAAACGACCACCTCACCGCG^^ 

GGGGCAGGATTACTTAATATTGACGGAGCTGTCACTAGCGGCCTTTATGTGACAGGAAAAGACAACTATGGCA^^ 

GGCAACATGACAGATACGATGACGTTTGATGTGACTGTTCACAACCTAAGCAATAAAGACAAAACATTAGGTTATG^^^ 

TGCTAACAGATCATGTAGACCGACAAAAGGGCCGCTTCACTTTGACTTCTCACTCCTTAAAAACGTACCAA^^ 

CAGTCGCAGCCAATGGAAAAGTGACTGTAAGGGTTAGCATGGATGTCTCACAGTTCACAAAAGAGCTifl^^ 

ATGGTTACTATCTAGAAGGTTTTGTCCGCTTTAGAGATAGTCAAGATGAGCAACTAAATAGAGTAAACATTCC^ 

AAAGGGCAATTTGAAAACTTAGCAGTTGCAG/V\GAGTCCATTTACAGATTAAAATCTCAAGGCAAAACTG 

AATCAGGTCCAAAAGACGATATCTATGTCGGTAAACACTTTACAGGACTTGTCACTCTTGGTTCAGAGACC^ 

AACGATTTCTGACAATGGTCTACACACACTTGGCACGTTTAAAAATGCAGATGGCAAATTTATCTTAGAAAAA/^ 

AACCCTGTCTTAGCCATTTCTGCAAATGGTGACAACAACCAAGATTTTGCAGCCTTCAAAGGTGTT^ 

GCTTAAAAGCAAGTGTCTACCATGCTAGTGACAAGGAACACAAAAATCCACTGTGGGTCAGCCCAGAAAG^^ 

AAAACTTTAATAGTGACATTAGATTTGCAAAATC/^CGACCCTGTTAGGCACAGCATTT^ 

ATTACCAGATGGGCATTATCATTATGTGGTGTCTTATTACCCAGATGTGGTCGGTGCCAAACGTCAAGAAATGACA^ 
ATTTTAGACGGACAAAAACCGGTACTATCACAAGCAACATTTGATGCTGAAACAAAGCGATTCAAACCAG^ 
GTGGATTAGCTGGTGTTCGCAAAGACAGTGTCTTTTATCTAGAAAGAAAAGACAACAAGCGTTATACAGTTACGA^ 
CTAC/W\TATGTCTGAGTAGAAGACAATAAAACATTTGTGGAGCGACAAGGTGATGGCAGCTTTATCTTGCGGGT^^^ 
AAATTAGGGGATTTCTATTACATGGTCGAGGATTTTGCAGGGAACGTGGCCATCGCTAAGTTAGGAGATCA^^ 
IGTAAAACACCAATTAAACTTAAGCTTACAGACGGTAATTATCAGACCAAAGAAACGCTTAAAGATM^ 



TTAGC 
?Jf?SAAAGTACAGCCTGGTATGGCATAACAGCCCGAGGAAGCAAGGTGATG^ 

TCACGAACATGGTAAAGAACATCAAAAGCAGTACACCATATCTGTGAATGACAA^ 

ACCATTAATGGCGTTGACCAGTTTACTCGTGACAAGACAAAAGCGGTTGAGTCATCAGGCAT^^^^ 

TCGCCAAGAAAAATGGCCGTAAATTTGATGTGACAGAAGGTAAAGATGGT^^^^ 

^aX^TCCAGATGGTTCTTACAGCATTTCAAAAAGAGATGGTGTCACACTGTCAGATTATTA 

^GTGTC.TTTTGCTACCTTGCGTGACCTAAAAGGGGTCGGAAAAGACAAAGGAGTAGTC 

CCTGAAGACAAACAAATAGTGAACTTTACCTACCTTGTGCGGGATGGAGATGGTAAACCGATTGAAAA^^ 

actc^gctX^gt^atcttgcgatacggcaaatacacggtcgaattg^^ 

CAGATAAAATCGTTTCCTTTACCTTGTCAGCTGATAACAACTTCGAACAAGTTACC^ 
ATAAGTGGCGACTTTGATC^^^^ 



ctcaS^aIcSS^Sccta?^^^ 




TACTTGCGTCTTTAGCCGAAAAAAATCAACCA/VKGATTGA 
SPy0430 

A^o'^AiTn^ArTrr^TTTTATGAAAACAAAATC/Wi^CGCT^ 
rPA^GGACAACAT^^C^^ 

TATCAAGCATTTAGTACTATTTGGACATACTT/SiAGCGGTTTGTTCT/VSi 



SPy0433 

GAACGAGGAAT^^^^ 
GCTATA^^^ 

AGATCAGTGGTGGAACTTTAAGGGACTGTTTCAGTGA 



SPy0437 
Seq ID 22 



ATGA^WWV^CATTAACGTTGCTACTGGGACTCTTTGCCATCGGGGTAACr^ 
I?A?CTCAGlL^TX2?GAAGGTci^^ 

^^?^y^SCXc^CC^^^GGOTAATAAACCTTGGCATCAGCGGTTAAGCGAAAACATCCAA^^^ 

ttaagggactgtttcagtga 

SPy0469 

A^ArS^TCGGrT^CAG^ 

A^rrAcS^lcTGAAGcXATGGG^^ 

^AC^TCCTAACAGCAAACTAC^C^^^ 

™tcZtS^Scctag^ 

ct^GT^GGTlTGGCCATTGA^^ 

?S??5^^cgt^ta1ttcatacP^^ 

GGAACCCCATGCCAGA^^^ 

SPy0488 

l?oroorAGATTCAGTGCATTCGTCTGATAGACGTTTTGGAGTTGGCTTTTG^^ 

T?^?^aag1^^a?^cgc^^ 

TGATGAAAGA^^^ 

^rrfZI^w/^GGTA^ 
^§^^I™CA?G^S?G^^?L^^^^ 
AGGGGAGGCAGGCCACGATTACTTATGGGGACGGCTCTACGGATATTATTCCTCCAGCAGmA 

aaagagcctactgaagccgatcaatctgtcggaacaccgactcctggtattcctggtX^^ 

GAGCATGAAGCTATGGTAAATGTCGAACCACTGTCTCATGTAGTAAAAGACAATATAA^^TC^^ 
CGGTTTGAGCCTTTTAGACCTAATGAAGATGAGAAGGAGAAGCCTGCCAGCGATG^GGT^^ 

ctggctagaaccagcgacagctcttcctagtgttgaaatgagcgctgaggacagg™^ 

SPy0515 
Seq ID 25 

atgaaagtcttattgtatttagaagcagaaaattatctaagaaaatcaggaattggtcgagcgattaagcatgaggctaaagcct 
tgtcacttgttggtcaacattttacgactaatccaagagaaacttatgatttggttcatctcaat^^ 

CTGGTGATGATAAAAGCACAAAAAGGTGGTAAGAAGGTTATCATGCATGGGGATTCTACAd^<^^ 

ttttttcaaatgtattatctccttggtttaaaaaatacctttgtcacttttacaataaggcagatgctatc^ 

ATTCTAAGTCTTTGATTGAGAGTTATGGAGTGAAGTCAGCTATTTTTGCAGTGTCAAATG^^^^ 

agatcctaaaaaggaagcagcttttggtcgctagtttgacattaaagagggtgaaaaagtggtt^gggag^^ 

CTGAGGAAAGGAATTGATGACTTTGTCAAAGTTGCCCAAGCTATGCCAGATGTTCGTTTTATCTGGTn^CK^^ 

GGGTCATTCCTGGTCAAGTTCGCCAAATGGTCAATGGTAACGACCCGAAAAATCTTArrTTCCCAGGATA^TT^C^^ 

TTATGAAGGTGCCATGACTGGTGCAGATGCCTTTTTCTTTCCAAGTCGTGAAGAAACAGAAGGCATTCTTCT^ 

GCCAGTCGCCAGCACCTTGTTTTACGTGATATACCAGTTTAGTACGGATGGGTTGATCAAAGTAGTGCGGAATTAGCAACCGAT 
ATACCAGGT^^^^^ 

CGTCGCGTAGAAAGGGTTGGGCATGCCTTAGTAGATGTCTAT/WW>AGTAATGGAGTTATAA 

SPy0580 
Seq ID 26 

f!^SS?^^'^^*^'^^^^*^^^^^^°'=^'^®'^'°^CC°TAAAGAAGTAACAGGAAATCTAGATGAAGTGGTG 

GTTACGAACGAGTATTGAAGCAACCGAGAAACTCGCTGATTTGGAAGAGTTGTAGCTGCCTTATAAGGAAAAAGGCCGTAC^^ 

^gcgacgattcxjgcgtgaagcaggcttgttccca™^ 

A$^SZri?Z?f^^^°°'^'^°^^'^°^^A^'^'5^°^TCTAGCAGGAGCTGTGGACATCCTTGTGGAGGCCATC^ 
aXa?I^TT™^^*^'^°'^^^'^'^^^"^^^'^°'3CAGTACAGCCGCTTAGTATCAACGCTTAAAGATGAGCAG 

CGAAAAGTTAGGCATCTTAAAAGTGTCTTTTGAGCATAACrrGGAGAAAATGCAAGGCTTTTTCAGTGTGCGGTTCAAAGAAACC 

AACCCTTATATrGAAGAGGTCATCAATCAGACCATCAAAAAG/W\ATTGTTCCAGCTATGGAAAGACGGGTTCGTTCAGAACTCA 
^TGATGCCGCAGA^^^^^ 

?°TJflTJPP^^°ACCCTGCCTTTCGAACAGGTGCAAA^ 

:^TIT'^9^GTAGCACCAGCTAGCCAAACGAAGATTCAAGCAGCTAAAGAAACACTGACTCAGCTCAT^^ 
GATATTATTGCCATTGGAAACGGAACAGCCAGTCGGGAAAGTG/i^GCCTTTGTAGCAGACGTrrTGAAAGATTTCCCAAATACG 

?^'^j^^CGCTCTGCTATlTCTATT^^ 

TCGGTCAGTACCAGCACGATGTGAGTCAGAAAAAGTTGAGTGAGAATCTTGGTTTTGTTGTGGATACTGTGO^ 

GTGTTAATGTGAACACGGCTAGTCCATCTTTATTAGCACATGTGTCAGGTTTGAAGAAAACGATTTCAGAAAATATTO 
CGCG/^GAAA^^^^ 

AGGTTTTCTCAGAATTCCAGGAGCCAAAAATATTTTAGACAAGACAGGAGTTCACCCCGAGTCCTACCCAGCAGTCAAAGAACT 

CTTTAAGGTAGTTGGTATTCAGGAGTTGGACGACGCTGCCAAGGCAACTTTAGCAGCAGTTC/ikAGTTCCCCAAATGGCAGAAAC 
ATTGGCTATTGGGCAGGAAACCGTTAAAGATATC^^ 

!^f^IST!^GTCAAGATATCCTTGATTTGAAAGATTTGGAAATTGGCCAGAAGCTTGAAGGAACTGTGAGA^ 
I]l°^TGCmGTAGACATTGGTGT^^^ 

TGCCAcScGTGACA^^^ 

SPy0621 
Seq ID 27 

J^^CAACGCTTACGACG^^^ 

fX^^^l^QACGGCTGCCCTCTTACATGACATTGGTCATGGAGCCTATTCTGATACCT^ 

^G;rr?^AGCGTG^^ 

CATCAATACTTAGTTCCAAGTCTGGATGGCAAGCGAAGACCATATCTTATCAGATTTGGCGAGTCGCTTTATTAATCG^^^ 
CTGAAGTCTGTTACCTTTGACCAAGATTCCCAAGGAGAGTTAGAACGCTTGCGCCAATTGGTC(^^ 

AA I eGGCATTTTCACTTTAGTCAGTAA 

SPy0630 
Seq ID 28 

ATTACCAGTGCTGGTAAAGTCACCCCAGAAGCGGCACTAGCCTTATCAACAGCAATC^^^ 
TTCGCCTATACAGCTTTTGCTGGCGCTCCCGAAACCGCTAAAAAACAATTGCAAAAAGGCAATATTAGAGGCTTC/^ 

CC/^ACGGCACTATCTGGGCTTTCGGTTTTATCGGATTAGGCCTTGGTTTATTAGGTGCCTTGTCAATGGATACTGTGC^^ 

GGTTGATTATATGGCACGTGTCTTGCTCAATGGATTGACTGTCGCTGGTAAAATGTTACCTGGTATGGGATTTGCGATGATCTTA 

TCTGTTATGGCCAAGAAAGAATTGATTCCTTTTGTACTAATTGGTTATGTTTGTGCAGCCTACCTCCAAA^^ 

CGCCATTATTGGTATCATTTTCGCCTTGAATGAATTTTAC/JiACAAAGGTAAACAAGT^ 

CAAGATGACTGGATCTAA 

SPy0681 

TTGACCCCTCGTAGCGGAAAGACCACAGCTGGGCATTTTCGTTATGCTAGGTATGTGATTGAGTCAGAAGATC 

GTGAGTGCTTATAATCAAGAACAAGCTTATCGTTTGTTTATCGACGGCGATGGTACGGGTTTGATGCATATATTTGACG^^ 

GTGAAATAAAACACGAGGAGCGTGGAGATCAGTTGTTAATCACGACACCAAAAGGCAATAAGCGCGTTTATTATA^ 

GTAAAGTT/yi^CAGTGTTGGTGCTATTACAGGTATGTCTTTAGGATCAGTAGTATTCTGCGAGATT/^CTTACTGCACATGGAT^ 

ATCCAGGAGTGTTTTAGGCGTACTTGGGCGGCTAAGCTACGTTATCATCTAGCAGATTTAAATCCCCCAGCACCTGAACATCCA 

GTAATTAAAGATGTCTTTGATGTTCAG/s^CACGAGGTGGACTCATTGGACCATGGATGATAACCCAATACTAACCGCAGAGCGT 

AAACAAAACATTATCAACAGTCTT/WWW^TCCATATCTATACAAACGAGATGTACTTGGACAGCGGGTCATG 

TTATTTATGGCCTTTTTGACACGGAAAAAAATGTTTTGGATGCTTTGATTGGCG^^ 

AGGTCAATCAGATGCCACCTCTATGTCTTGTAATATCGTAACAAGAGTTAGAGATAACGGTAGGATAAGCTTCAGACTTAATCGT 

GTAGCTCACTAGTACCACAGCGGAGCTGACACTGGCCAAGTAAAAGCTATGTCAACCTACGCTTTAGAGTTAAAAGTTTTTATAG 

ACTGGTGCGTTAAAAAGTATCAGATGCGCTATACAGAGGTATTTGTGGATCCTGCCTGTAAATCTTTGAGAGAGGAGCTGGATA 

AGTTAGGAGTATTTACTCTGGGAGCTGGGAACAATTCTAAAGATGTATCTAGCAAAGCAAAAGGTATTGAGGTCGGTATCGAAC 

GCGGCCAAAACATTATCTCAGATGGCGCTTTTTATCTTGTTAATCATAGCGAAGAAGAGTATGACCACTACCACTTTTTAAAAGA 

GATAGGGCTGTACAGTCGTGACGACAATGGCAAACCTATTGATAAAGATAAGCATGCCATGGACGAGTTTAGATACAGCGTCAA 

CGTGTTTGTGCATCGGTATTACAACTAA 

SPy0683 
Seq ID 30 

ATGAAAAAGAAGCCTATTAAGTTAAATGACG/^ACAGCTTCTTTTGGAAGGTAGTCAGTTATCTGATATGTATCATC^ 

TGATTTATTTGATCAAGTGATTGAGAGGATAAAAGCCAGAGGCTCAGCGAGGTTAGCCGAT/VATCCTTATCTTTGGCAAGCTAAT 

AAGTTACATGACGTTGGACTGCTT/V\TGCAGATAACATC/\AGCTTATTGCAAAGTATTCTGGCATTGCGGAAGCTCAACTTCGCT 

ATATTATCAAGAATGAAGGATrrAAAATTTATAAAAACACGTCTGAGGAGCTAGAAGAGGCTCTAGGTAGAGAGTCTGGGGTA^ 

CAGTACTATCCAAGACGACCTATCTAAGTATGCTAGACAAGCTATTGATGATGTGCATAATTTGACTAACACCACCTTGCCATTTA 

GTGTTATAGGAGCTTATCAAGGGATAATCCAAGACGCTGTTGCTGGTGTGGTGACAGGCTTAAAAACGGCTGACCAAGCTATCA 

ATC/WkCTGTGATT/W\TGGTTT/WW\GGGGTTTTATGGTTTTACAGATAAAGCTGGGAGAA^ 

TCGTACCGTTATCAATACTACGACTTGGCGAGTCTTTAACGAAGCCAAAG/Wi.GCCCCTGCTAGGGAGTTTGGCAT^^ 

CTATTACTCAAAAAAAGCTACAGCTAGAGAGATGTGTGCACCTTTGC/V\CATCAAATTGTCACTACTGGCGAAGCGAGAG^^ 

AGGAGGGAT/WW\TCTTAGCTTTATGTGATTACGGGCATGGTGAGCCTGATGGATGCTTGGG/VATCAACTGCAAGCACACTAA 

AACGCCGTTTGTCGTCGGTGTG/\ATAGTAAGCCAGAATTGCCAGAGCATCT/WWV\TATCACTCCTGCAC/^AGCT/W\GCTA^ 

TGCGAATGCGCAAGCTAAGCAGAGGGCAATGGAGAGATCAATACGTAAGAGT/WkGAGCTACTGCACGTTGCGAAGCAATTGG 

GTGATAAAGAGTTGATTAGGCAATATCAATCGGATGTTAGAAGTAAACAAGATGCACTCAATTATCTGATAAACAACA^^ 

TTACATCGCAATC/^AGCCAGAGAAAAGCGTTACAATAATCCTTATACCAAAACTCAAAGTG/Su^GTCG/^^ 

CTAAATTAGATAAACGTAGGGATGTTGAAAGTGCTATAATAGGAGTAGAAACTAGTG/V^GGGATACCGCTAAAAATA^ 

TTTAGCCGAAAGGGCGGTGCTGAGAAATATAGCACCTATTGATATTGTCGATTCTATAAAAGAACCGTTGAAGATAGCTCCTAT^ 

AAGTACGATAACCTTGATAGACCTTCCCAGAAATACATTGGTAAGTGTGTCTCGACAGTAAT/WKCCCGATAGACGGAAATATTG 

TTACAGTTCATGCTACTAGCACGAGAATCCGCAAAAAATATGGAGGAAATTGA 

SPy0702 
Seq ID 31

ATGAGCAGAGACCCAACACTrATTTTAGACGAGTCAAACCTCGTTATTGGTAAGGATGGACGTGTGCATTACACATTTACCACAG 

AGGACGACAACCCAAAAGTCAGAGTAGCTAGCAAGTGTCTAGGCACAGCGCATTTTAATCAGCTCATGATTGAGCGAGGAGAC 

CAAGCTACTAGCTATGTTGCGCCAGTAGTAGTTGAGGGTAGAGGTAATCCGACTGGACTATTTAAAGAGCTCAAAGAGATTAGC 

TTAGAGCTGACAGATACTGCTAATTGCCAGGTTTGGTCAAAAATCAAGCTGACTAACCGTGGTATGTTGCAGGAATACTACGAC 

GGTAAGATCAAGACCGAGATAGTCAACTCCGCCAGAGGTGTCGCTACACGTATCAGCGAGGATACTGATAAAAAGCTAGCGCT 

CATCAATGACACCATTGATGGTATCAGGCGTGAGTATCGAGATGCTGATAGGAAGCTATCCGCAAGCTATCAGGCAGGCATCG 

AGGGGCTAAAAGCCACAATGGCCAATGATAAAATCGGTTTAC/V\GCTGAGATT/W\GGCTCAGCAC/^GGGCTATCGCy^^ 

ATGATGATGAGTTGCGCAAGGTATCGGCTAAGATCACAACAACCTCAAGCGGCACTAGAGAGGCCTACGAGAGTAAGCTTGCG 

GGCTTACGTGCTGAGTTTACTCGCTCAAATCAAGGCACGAGGACAGAGCTCGAGTCACAAATTAGCGGGCT/V\GAGCGGTACA 

GCAGTCAACAGCTAGCCAAATCTCTCAAGAGATTAGAGACCGTGAAGGTGCTGTCAGTCGTGTGCAGCAGAGTTTGGAGAGTT 

ACCAAAGGCGGATGCAGGACGCAGAAGAAAACTATAGTAGCTTGACCCATACGGTTAGAGGGGTACAGAGCGACGTTGGATCT 

CCGACTGGTAAAATCCAATCGCGCGTTACTCAACTAGCAGGACAAATTGAGCAGCGGGTTACTAGAGATGGTGTCATGAGTATT 

ATTAGTGGCGCTGGAGACAGCATTAAATTAGCTATCCAAAAGGCTGGCGGCATTAATGCCAAAATGTCTGGTAATGAGATTATC 

TCAGCAATTAACCTCAACTCGTACGGAGTAACAATCGCAGGTAAACACATCGCTCTCGATGGGAATAGGACGGTTAATGGCACC 

TTTACCACAAAAATAGCCGAGGCTATGAAGATTAGGGCTGATCAGATTATTGCAGGCACGATTGACGCTGCTAGGATTAGAGTG 

ATTAACCTTAACGCAAGTAGTATCGTTGGTTTAGACGCT/VKGTTTATCAAAGCTAAAATTGGCTAT^ 

GGGT/V^GGTCATTAAGGCTCGTAATGGAGCGATGCTTATCGACTTAAATACAGCT/V^GATG GACTT TAATAGGGA 

T/\ATTTTAATAGC/MVW^CAATGCCTTAGTACGTAAAGATGGCACACATACTGCCT^ 

GGTTATACAGGGTCAGCGTTGTATGCATCGATCGGGATAAGCTCATCTGGTGACGGTGTTAACTCGGCTTCTTCCGGTCGTTTT 
GCAGGGCTAAGGTCATTTAGGTAGGGTAGGGGATATAATCACACTGCGGGAGTCGACCAGACTGAAATTTACGGTGATAATGTT 
TTAGTTGTGGATGATTTTAATATTACTCGGGGATTT/^AGTTTAGAGCAGAC/V^GATGCAAAAAATC 

TGCGGCTGTAGTAGCCTTAGGCCGCTGTTGGGGGCACTTGGCTAACGTCGGCTGGAATACTGCTCATAGCAATTTTACAAGTG 
CTGTGAATAGGGAATTGAATAACTACATCACAAA/^TTTAA 

SPy0710 
Seq ID 32 

ATGACCTTTTTAGATAAAATTAAACAAGGCTGTTTAGATGGCTGGGCTAAGTAC/WV^TC^ 

CTATCTTAGAGAGCGGGTGGGGCAAACATGCCCCACACAACGCTCTGTTTGGTATTAAGGCAGATAGCTCTTGGACTGGT/^AAT 
CATTTGATACCAAAACCCAAGAGG/s^TATCAAGCAGGTGTTGTCACGGATATTGTGGACCGATTTAGGGGGTATGATAGTTGGG 

ATGAGTCGATAGCTGATCACGGACAATTTTTAGTTGATAATCCACGCTATGAGG CAGT TATTGGGGAGACTGAGTATAAAAAGG 

CTTGTTAGGCTATT/WS^GCAGGTGGATACGCTACGGCAAGTAGCTATGTCGAACTTTT/\ATCC/\ACTGATTGAGGAAAACGACT^ 

AC/Ws^GTTGGGATAGAGAAGCTCTT/WWS^TAATAAGGAGGAAACGATGACAACCGC/WSiGGAAATTG^^ 

CTTGCT/\ATTCAGGCATGGGTGTTGACAAAGACGGTGCTCACGGGAGGCAATGCTGTGACTTGCCTTGrrTTGTCGCT/W^ 

TGGTTTGGTGTTGATCTTTGGGGC/s^TGCGATTGATTTATTAGACAGCGCAAGTGCGCAAGGCTGGGAAGTCCATCGTA^ 

ACAGAGGC/VSu'KCCCAAAAGGAGGCGCTACATTTGTCCAATCAGTGCCGTATCATC/iiATTTGGACATACGGG^ 

GATAGTGACGGTTACACCATGCGCACTGTCGAGC/W^ACATTGATGGCAATCCTGATGCTTTGTATGTCGGTGCACCAGCTCGT 

TTTAACACTCGTGACTTTACTGGGGTGATAGGTTGGTTTTACCCACCATATCAAGGGGATACAGTCACGC/\ACCAGTCAGCACC 

GAGCCGCAAACTTCTGACACTATCGTAGAGACAGCAAAAACAGGCACCTTTAGCGTTGATGTTGCAGAGATCAATATCAGACGC 

TGGCCAAGTCTAGCCAGCGAGGTTGTAGGTATCTACAAGCAAGGTGATACTGTCAGCTTTGATAGCGAGGGGTACGCTAATGG 

CTATTATTGGATTAGCTATGTTGGAGGCTCAGGTATGCGTAACTACCTAGGTATTGGACAGACTGATAAAGATGGGAATCGCAT 

CAGCCTTTGGGGTAAATTAAATTAG 

SPy0711 
Seq ID 33 

ATGAAAAAGATTAACATCATCA/W^TAGTTTTCATAATTACAGTCATACTGATTTCTACTATTTGACCTATCATCAAM 

AAGAAAGACATTTCGAATGTTAAAAGTGATTTACTTTATGCATACACTATAACTCCTTATGATTATAAAA^ 

TCAAGGACACACACATTAAACATTGATACTCAAi^i^AATATAGAGGGAAAGACTATTATATTAGTTGCGAAATGTCTT^^ 

TCAAAAATTTAAACGAGATGATCATGTAGATGTTTTTGGATTATTTTATATTCTTAATTCTCACACCGGTGAGTACATCTATGGAG 

GAATTACGCCTGCTCA/W»iTAATAAAGTAAATCATAAATTATTGGGAAATCTATTTATTTCGGGAGAATCTCA^ 

AACAAAATTATTCTAGAAAAAGATATCGTAACTTTCCAGGAAATTGACTTTAAAATCAGAA^ 

TATGAGGCTACTTCTCCTTATGTAAGCGGCAGAATCGAAATTGGCACAAAAGATGGG/^CATGAGCAAATAGACTTATTTG 

CACCAAATGAAGGGACTAGATCAGATATTTTTGCAAAATATAAAGATAATAGAATTATCAATATG/KAGAAC 

ATTTATCTTGAAAAATAA 

SPy0720 
Seq ID 34 

ATGATAACAACTTTTGAAACAATTTTAGATA/W^TAAAAGCTGACC/WSiCTATTATTA 
GCTCTTGGTAGTCAGGCCGGCTTG/WVGAAATTATTGCAC/WkATTTCCCAGAC/WW\GGr^ 

CTAGTTTAGCTTGGATTAGCCAAATGGATCAGGTGACTGAC/W^GACTATAAAGAGGCTTTGGTCATCATTACAGATACAGCG^ 
TCGACCAAGGATTGATGATGAGCGCTACACACTGGGGAAGTGCTT/V\TT/SwaiGATTGATCACCATCCCAACGATGATGTGTATGG 
TGACTTCTATTATGTGGACACAAGCGCTTCTAGCGC/Su^GTG/W\.TCATTGCAGACTTTGCCTTTAGTCAG 

GACAAGGCTGCTAAGCTCTTATACACCGGTATCGTTGGTGATACAGGGCGATTTCTTTATGCCTCAACCACTAGTAAAACCCT^ 
CCATTGCTAGCC/\ACTCAGACATTTCGAGTTCGACTTTGCTGCGATTTCAAGGC/WkTGG 

GCTGCAAAGCTACGTCTTTGAGCATTTAAC/^TTGATGAGAGTGGGGCTGCTTATGTCCTTGTCAGCCAAGAAACCT^ 
TTTGACGTGACCCTAGCAGAAAGCTCTGCCATTGTCTGTGCTCCTGGT/W^TTGATAACGTTCAAGCTTGGGCTAI I I 1 I GTTG 
AGTTAACTGACGGCAACTACCGTGTGCGTATGCGCAGTAAAGAAAAGATTATrAATGGCATTGCTAAGCGTCACGGTGGAGGG 
GGGCATCGCCTTGCTAGCGGAGCC/i^CTCAGCT/^ATTTAGAAGAAAATC/^AGCTATTTTCCGAGAACTCATCGCTGT^ 

GAGATTTAG 

SPy0727 
Seq ID 35 

ATGATTGAAGAAAATAAACATTTTGAAAAAAAAATGCAAGAATACGATGCCAGTCAAATTCAGGTTCTAGAA^ 

TGCGGATGCGTCCAGGGATGTATATTGGCTCGACAGCTAAAGAGGGTTTGCATCATTTAGTCTGGG/W\TTGTTGACAACTCAA 

TTGAGGAAGCATTAGCAGGTTTTGCCTGTCATATTAAAGTGTTTATTGAAGCAGATAATTCCATTACAGT^ 

TGG/\ATTCCAGTTGATATCCAAGCCAAGACAGGACGTCCCGCCGTTGAAACAGTTTTTACAGTCTTACACGCAGGTGGTAAAT^ 

TGGTGGAGGCGGCTATAAGGTTTCTGGAGGATTACATGGTGTAGGGTCATCTGTTGTTAATGCTTTATCAAGAC/SvATTAGATGTA 

CGTGTTTATAAAAACGGCGAAATTCATTACCAAGAATTTAAACGCGGGGCTGTTGTAGCAGATCTTGAGGTCATTGGAACCAGT 

GATGTGACTGGCACGACCGTACACTTTACACCCGATCCAGAAATTTTTACCGAAACGACTCAGTTTGATTACAGTGTTTTAGCAA 

AACGTATTCAAGAGTTAGCCTTTTTGAATCGTGGTTTAAAAATTTGCATTACAGATAAGCGCTCAGGTATGGAAC/N^ 

TTTCCTTTATGAAGGTGGAATTGGTTCTTATGTTGAATTTTTAAATGATAAAAAAGATGTTATC 

TGGTGAATTAGAAGGTATTGCAGTTGAAGTAGCGATGCAATACACGACTAGCTATCAAGAAACAGTCATGAGTTTTGCTAATAAT 

ATTCATACTCATGAAGGTGGAACGCATGAACAAGGCTTTAGAGCGGCTCTTACTCGGGTCATC/iATGACTACGCTAAGAAAAAT 

AAAATTCTTAAAGAAAATGAGGACAATTTGACAGGAGAAGATGTTCGTGAAGGTTTGACGGCGGT/^T^ 

ATCCTCAATTTGAAGGTC/W\.CCAAAAGAAAATTGGGCAACTCAGAAGTGGTTAAGATCACT/^TCGTCTCTT^ 

TCAACGTTTTCTTTTGG/W\ACCCAC/s^GTTGCTCGTAAGATTGTGGAA/W\GGGATT^ 

AAGCGAGCCCGCG/\AGTCACCCGCAA/W\ATCAGGCTTAGAAATTTCAAACTTACCTGG/\AAAT^^ 

GACGCTAACCAAAACG/VkCTTTTCATCGTCG/\AGGAGATTCAGCGGGTGGGTCGGCCAAATCAGGTCGTAACCGAGAGTTTCA 

AGCTATCTTGCCTATTCGCGGT/WV\TTTTGAACGTGG/WWs^GCAACTATGGATAAGATTCTTGCCAACG 

CTCTTTACCGCTATGGGTACAGGTTTTGGTGCAGATTTTGACGTGTCAAAAGCTCGCTACCAAAAGCTGGTTATCATGACC^^ 

GCCGATGTGGATGGCGCTCATATTAGAACCTTACTTTTAACCTTGATTTACCGCTTTATGAGACCTGTTCTAGAAGC 

TTTACATCGCCCAGCCACCTATTTATGGTGTTAAGGTCGGTAGTGAGArrAAAGAGTATATTCAGCCAGGTATTGATCAAG/V^ 

CC/W^TTAAAAACAGCTCTTGAAAAATATAGTATTGGTCGTTCAAAACCAACTGTTCAACGTTATAA^ 

GACCATCAACTTTGGGAAACTACTATGGATCCTGAAAATCGTTTGATGGCGCGTGTGACAGTTGATGATGCCGCAGAAGCAGAT 
AAAGTATTTGATATGTTAATGGGAGATCGTGTTGAACCAAGACGTGATTTCATTGAGGAAAATGCGGTT^ 

TTTAG 

SPy0737 
Seq ID 36 

ATGCGTAAGGTCAAAAAAGTCTTTGTTAGTTGATGTATGCTTTTAACAGTGGGCCTCGGAGTTGCCGTACCm^ 

AATCTAATGGCGTGATGGTTGTAAAGGCTGGGGAAGTGCCGGCGACAGATTTATCACGTCAGGCGTCTGATTCGGAGAGGGTA 

GATGAATCGTCTTTATTGCAGAAAGAAAACTTATCAGTAGATTCATTTAAATTAGAAAATTTA^ 

TACAGCAGGTAATTTGGGGAAATTTAAAGATCCAGATAGTTCGGGCTATCAAAATATTTTGACATCATCTGG/W\G 

GTAGGTGTTGCTCCC/W^GGTTCAGGTA/WKTGAACATTAAAGTAACTAAAAGATCAAATTTTCAGGGTGGATA^^ 

GTCTTAG/VVCTCAAACTCCGGTATrGAAGTTAAATGATGTTTATCGATATTCTTTTAC/\ACT^^ 
AGTTCAAAACGAGAGTTAAGCCCGTTGAATCTAATAATAAACTAGGGAAAGAGCTTGTTATTAGGGTGGATAATAAA^ 

TACTAAGCATGATTGGCTTCCAGACATGTCTGATGGAACTCATACTGTGGAGTTCACTGGTCTTGATAAAAAATTATGTGTTGCTT 

TCAGATTTTGTCCAAGAGAAACTTCGAATGTTGTTTAGGAATTTTCTAACATAAATATAAAAAACAT^^ 

GCTATTCCTrGG/W\GTTTTAGAGGGAACCAGCGTCTTGTCGGGTACTGCAATATCTTCTGGAGATACATTAG/WWV5iG^^^ 

CGTTTGATGGCGATATCCTAAGAGTTTATAAAGATAGCAAAATCATTGCTAGAAGAGTAATAAAAGGCAATAAGTGGGATGT^ 

ACTTTCAAAGCCTCTTATTGCAGGTGAAAAATTAGATTTTGAGATTTTGCATCCGAGATCTCAAAACGTTAGTA^ 

AACAAGTCGAAGCTAAACCATTTGATCCAGCTTCCTATAAAGAAAAAGTTATAGCCAAATTAAAGCGGGT^ 

TGAAAAAATCACAAATGATGCTTGGTTGGATGAAAATGCG/^GGATTTGC/WWiiCAAAA^ 

AAAGTAGGGATATGAGAGGCTGGAACTAAACAAGAAGCTATAGATGCAGCATATAATAAATATTC>^^^ 

CTGTTCCTAGTCAGTATAAAC/y\GGT/!^TAAAGAAAATGAAC/^GAAAAAGGGCGTCAAGAT^ 

GTTGAAAGCCATTCAAGAAGAC/WVTGGCTAACAGAGCAGGAGAAAAC/^TTC/WWkGAAG/\^^ 

GGTATAGAAAGTGTTAATC/W^CAGTATCATTAGAACAGTTGAAGC/VkCGGTTAw^TAGTGTAT^ 

AG/yVW^GAATATCCTGAGTCAATTCGTAATCAGCATATTCCAGGG/W\G/W^ 

TAAAAAATTACATGACAGAACTCTTGAAAAAATCAATCAAGATAAATGGTTGACGCCAGACCAACAAGGTGAAGAGT^ 

GCGG/W\GTTACTTTTAAAAAAGGCC/^AG/\AGC/\ATTAAAAGTGGTCAGACTTT 

TTTCTGAG/VKTG/VS^GGTAAGGGAAATTCTATrCCCGAT/W\TACAAATCTGGCAATA^ 

TAAACTTAAGGAAGCTCACG/KAGCTACTAAACAAGCAATTGAAAAAGATCCATGGTTGAGTCCGGAACAGAAA/^ 

AGAAAAAGCCAAAGCAAGACTAGATGAGGGCTTG/W^GGTCTTAAAGCTGGAGATAGTTTAGAGATTCTTA/^ 

AGCTTTCGTTGAT/WS^GAAAAAAATCCAGATTGAATTCCAAATGAACATAAAGCTGGAACTGCTGATCAAGCTA 

TTAGATAGTTTAGATAAGGAGGTTCAAAAAGAGTTAGAGTCAATTGATAACGATAATACTCTAACAACTGATGAGA^ 

CTAAG/WWiAGTCAATGAGGCTTATGATGTAGCTAAGCAAACAGCTATGGAAGCCAATTCTTATGAAGATTTGACT^^ 

GATGAGTTCTTATGTAATTTACCTCATAAACAAGGAACGCCGGTTAAAGATC/KACAATCTGATGCTATTGCAGAATTAGAG^ 

AGGAGCAAGAAATTGAAAAAGCTATTGAGGGTGATAAAACATTACCAAGAGAGGAAAAAGAGAAACAAATTGCT 

AAGGCTTAAAATGTGACACGCAAAAAGTTAAAGATGCTAAAAATGCTGATGGTATTAAAAAAGCATTTG^^ 

TATTCCTCAAGCAGATATCCCAGGTGATTTGAACAAGGATAAAGAAAAAGTTGTTGCAG/KATTGAAGGAAAAAG 

GAAAAAGCTATTGATGTTGATAAAACTCTGACAGAAGATGAGAAAAAAGAGCAAAAAGTCAAAACAAAAGCTGAACT^ 

CT/W^AGTGATGTTAAAAATACTCAGACACGTG/^AGAACTAGATAAAAAAGTTGCAGAACTTAAGAAAGCTATTG/^ 

GGTTAAAGGTAATGTTGAAGGTGTTAAG/^T/i^GGGTATTGAAGATGTT/WWV\GGTGATACTGAAAGAGT^ 

GTGATGATACCGTTGAG/W\GCTACTAAAGAAGGTG/V\GTGAAAGAAGCTGAGAAAGGTTTGGCAGGAGGTAA^ 

CTAAAGCAGATGATGCTGATAAAGTAAGTACAGCTGTTACAGAGGACACACGAAAAATTAAAGCAGCACAT/WSACTGGTGACC 

TTAAAAAAGGTCAAGTAGATGCTAACACAGCTCTTGACAAAGCAGGTGAAAAAGAACGTGGAGAAATCAATAAAGATO^ 

AAGGAGAGAAGATAAAGGAAAAC/\AGTGAAAGAAGTTGAGACAGGTGTTACTAAAGCT/WiiGATAACGTG/s^ 

AGGAGAGGGTATGAATGAGGCACGTGATAAAGGGGTAGCAAG/^ATTGATGGGGTGCATAAAGCAGGTCAAGACTTAGGTGCTC 

GTAAGTGAGGTCAAGTGGGTAAAGTTGAAGAAGCAGCT/W\GCAACGAAAGAC/^AGATTTCAGCTGATCC/W5^CT^ 

AAGAAAAAG/\AGAGGAATGTAAAGCTGTTGACGGTGAACTTAAGAAAGCGATAGAAGCTGTTAACGCAGCTGACACAGCTG 

AGGTTGAGGATGCTGTTGGTGAAGGTGTTACAGACATCAAGAACCAACACAAGTCAGGTGACTCTATCGACGCTCGTCGTGAG 

GCTGATGGTAAGGAACTTGATAGAGTCGCTCAAGAAACTAAAGGTGCGATTGAAAAAGACCCTACTTTGACGACTG^^^ 

GCT/W\C/VKGTTAAAGACGTAGATGCCGCT/WKGAAAGAGGCATGGCT/VS^GCTTAATGAAGCTAAAGATGCT^^ 

AAAGCTTAGGGTGAAGGTGTTACAGACATCAAGAACCAACACAAGTCAGGTGACCCTGTCGACGCTCGTCGTGGATTACACAA 

G/KAGTCAATCGACGAAGTGGCGC/\AGCAACT/\AGGACGCTATCACAGCAGATACGACTTTGACTGAAGCTGAAA/^ 

AAGGTGGGAATGTTGATAAAGAAGC/\ACTAAAGCT/WKGAAG/\ACTTGCT/V^GGCTAA^ 

ACGGTGACGGTGTAACCAGCATCAAGAACC/^CAC/VAGTCAGGTAAAGGTCTTGACGTTGGTAAAGATGAGGACAAG/W\GGT 

CTTGAAGCTGTTGCTAAACGTGTCACTGCTGAAATTGAGGCTGATCC/^GCTTAAGAGCAGAAGTGAGAGAACAAGA^^ 

GAGGTrC/WWKGAGCTTGAACTTGCGACTGATAAGATTGGTGAAGGTAAAGATGGAGATGAAGGAGACAAAGCTTAGGGTGAG 

GGTGTCACAGCGATCGAAAATGCCCACGTTATTGGTAAAGGTATCGAAGCTCGTAAAGAGCTTGGT/\AGAAAGACGTTGGTG/\^ 

GCTGCTGCTAAGACAAAAGCTCTCATTATTGAAGACAAAAGGGTTAGTGATGATCAAGGTAAAGAACAGTTATTAGGTGT^ 

CAGAGTATGCTAAAGGTATCGAGAATATTGATGGAGGTAAGGATGCTGGAGGTGTTGATAAAGGATATAGTGACGGTGTTGGTG 

ACATCCTGGCAGAGTAGAAAGAAGGTCAAAACCTT/V\TGATCGTCGTAATGGTGCCAAAGAATTTGTTCTTAAAG/W\^ 

AGTGACGAAACT/^TCAATGATGATCCAACCTTGACTCATGACCAAAAAGTTGATCAAATGAACAAAGTTGAAC/^ 

GACGCAATGAAGTGTGTTGATGATGCTCAAACAGCTGATGCTATGAATGATGCTCTTGGT/VKGGGTATTGAAAACATC/\AG^^ 

AATACG/VVGATGGCGATGGCGTTGATGTTCGTAAAGCGAGTGGGAAAGGCGATGTTG/WW^GAAGCTGCTAAAGTGAAAG 

TTATTGCT/^GGATGCGAGGTTAACTCAAGCTGATAAAGAGAAAG/WiiGAGCAGCGGTTGACGCAGGT/^AGAATACAGCAA^ 

CAGGGGTTGATAAAGGGAGAACAACTGAAGGCATT/^CG/V\GAACTTGGTAAAGGCATCACAGGTATC/VATAAAGCTT^^ 

CAGGTGAAGGTGTTAAAGGACGTAAAGAAGCCGGTAAAGGTGATCTTGAAAAAGAAGGTGCTAAAGTGAAAGCTCTTATTACTA 

ACGACCGAACCTTAACAAAAGCTGATAAAGCTAAAGAAAGAGAAGCTGTTGGTAAAGCCCTTAAAGGTGCTATGGGAGC^ 

ATAAAGGGACAACAGCTGAAGGCATTAACGAAGAACTTGGTAAAGGGATCAGAGCTATCAATAAAGGTTAGGGTGGAGGTGAAG 

GTGTTAAAGCAGGTAAAGAAGCCGCTAAGGCTGATCTTGAAAGAGAAGCTGCTAAGGTTCGTGAAGCTATGGGTAACGAGCC/VV 

GGTTAACAAAAGGTGATAAAGCTAAACAAACAGAAGCTGTTGCTAAAGCTCTTAAAGGTGCTATCGGAGCGGTTGi^^ 

GAACAGGTGAAGGGATTAAGCAAGAACTTGGT/W\GGCATCAGAGGTATGAACAAAGGTTACGGTCCAGGTG/V^GGT^ 

GAGATAAAG/V\GCTGCTAAAGCTAATCTTGAAAAAGTAGGTAAAGAAAGTAAAGCTCTTATTTCAGGAGACCGT^^ 

AAGTGAAAAAGGAGTGGAAAAACAAGCTGTTGAGCAAGGTCTTGGGAAAGCACTTGGTC/V^GTTGAGGCTGCT/i^GACAGT^^ 

AGGTGTTAAGTTGGCAGAAAACCTTGGTACTGTAGCTATGCGTTGAGCATATGTTGCTGGTTTAGCTAAAGATACTGATCA^^ 

AGAGGTGGTCTTAACGAAGCGAAACAAGGTGCTATTGAAGCTCTT/W\CAAGCTGCGGCAGAAACACTTGCTAAG^ 

GATGCTAAATTGAGTGAAGCTCAAAAAGCTGAAG/^TCAGAAAATGTATCATTAGCGCTT/i^GACGGCT^ 

CAGCAGAATCTATTGCGTCTGTGAAAGAAGCAAAAGATAAAGGTATTACTGCTATCCGTGCAGCCTATGTGCCTAAT/^^^ 

TGGG/WVVkTCATGGTCAGGGAACCATCTTCCAAAATCAGGTGATGC/Wi^CTCAATTGTTCTTGT^ 

TCTTTTAGGTATGGTGCTTTATAGC/VSiGAAAA/KAG/WKGTAAAGACTAA 

SPy0747 
Seq ID 37 

ATGATTAACAAGAAATGTATAATACCTGTTTCATTGTTGACACTAGCTATTACGCTTACTAGTGTTGAAGMGTTACT^^ 

AAATTTGACTTATGCCAATGAAATCGTAACACAAAGGCCAAAGAGAGAATCTGTTATTAGTGATAAATCGAATTTTCCCCT^ 

CACCTTACCTAGCAAGTGTGGATTTTGGTGAGAGAAAAACACCTTTGCCAACACCTGATAAAGGAGTAAAAGTAACTA^^^ 

GTCTATTGCTCAAGTAAGAAAGGGGCCTGAAGAAAGACCCTATACTGTTACTGGCAAGATTACGAGTGTGATCAATGGCTGGGG 

AGGCTATGGCTTTTATATTCAAGATAGTGAAGGTATTGGACTTTATGTTTATCCTCAAAAAGATTTAGGATACAGJV^^^ 

TTGTTCAATTAACAGGTACACTTACTCGCTTTAAAGGTGATrrACAACTCCAAC^GGTGACTGCACACAAAAAGTT^ 
TTTCCGACTTCTGTTAAAGAAGCAGTAATATCAGAATTAGAAACAACAACACCCTCAACATTAGTTA^ 
TGGAGAATTATCAACTGATCAATAT/VACAACACATGTTTCCTTGTAAGGGATGATAGTGGTAAAAGTATAGTTGTTC 
ATCGTACAGGGGTTAAAGGGGCTGATGTTGTTACTAAAATAAGTCAGGGTGATTTGATTAACCTCACAGCCATATTGTCTATTGT 
TGATGGTCAATTACAATT/\AGACCGTTTTCTCTTGAACAATTGG/\AGTGGTTAAAAAGGTCAC^ 

CTCGTAATATTGTGAAAATAGGCGAGATTCAAGGAGCTAGTCATACGTCGCCACTTCTC/WW^AGCGGTCACCGTAGAAC^ 
TTGTTGTCACTTATTTAGACGATTCCAGTCA I I I I l ATGTTCAAGATCTTAATGGTGATGGTGATTTAGCGACTTGAGATGGTATT 
CGTGTTTTTGCTAAAAACGCTAAGGTTCAAGTCGGCGATGTTTTGACCATT^ 

ATGAGGAACGTAAGCAGACTGACCTTACCATCACCCAAATTGTGGCTAAAGCAGTGACC/WiufliCAGGGACAGCTC/\AGTTCCAT 

CACCGGTTGTTTTAGGGAAAGATCGTATCGGGCCAGCCAATATTATTGATAATGATGGCTTGCGTGTGTTTGATGCAGAAC^ 

AGGGTATTGATTATTGGGAATCAATGGAAGGCATGTTAGTGGCGGTTGATGATGCTAAAATCCTTGGTCCAATGAAA^ 

AATTTATGTCTTACCTGGCTGTAGTACAAGACCGTTAAATAATTCAGGTGGAGTATTAGTTCCAGCTAATTGTTATAACACAGATG 

TGATTCCTGTTCTTTTGAAAAAAGGCAAAG/W\TTATTA/iAGCAGGAGACTGTTACAAAGGAAGATTA 

TAGCTATGGTAATTACAAGGTCTTTGTTGATGACAGCAAAA/\CATGCCAAGTTTAATGGATGGTCATCTAAAACCTGAAA 

AAGTTGCAAAAAGACCTTAGCAAGTTAAGGATTGCTTCTTACAATATTGAAAACTTCTCAGCC/V\TCCTTCTTC^ 

GAAGGTCAAACGGATTGCGGAATCCTTTATTCATGATCTGAATGGTGCAGACATTATTGGATTAATTGAAGTCCAAGATAATM 

GGGCCGACTGATGATGGGACAAGGGATGCGACACAAAGCGGGCAACGCGTCATTGATGCTATTAAAAAACTAGGTGGGCCAAC 

TTATCGTTATGTTGATATTGGTCCAGAAAATAATGTTGACGGAGGTCAACCAGGTGGTAATATTCG/y^CAGGATTCCT^^ 

CCAGAGCGCGTCAGGGTTTGTGATAAGCCAAAAGGCGGTGGTCGTGATGCTGTAACTTGGGTTAATGGAGAATTAAACGTTAGT 

GTTGGTCG/y\TTGATGCAACTAAGGCCGCTTGGAAAGATGTTCGTAAATCACTAGCAGCAGAATTTATCTTCCAAGGTCGTAA^ 

TCGTTGTTGTTGCAAATCATTTGAACTCTAAGCGTGGGGATAATGCTCTTTATGGTTGTGTGCAACCAGTCACTTTTAAATCTGA 

GCAAAGAGGTCACGTGTTGGCTAATATGCTAGCAGAATTTGCGAAAGAAGGGGCAAAACACCAAGCTAATATTGTGATGCTAGG 

TGACTTTAATGATTTTGAATTCACAA^GACGATTCAATTAATCGAAGAAGGTGACATGGTTAACTTGGTGAGCCGACATGATAT^ 

CAGATCGGTATTCTTATTTTCACCAAGGCAATAATCAGACCGTTGAT/y\TATATTAGTTTCACGCCATTTACTTGATCACT^ 

TTTGAGATGGTTGATGTGAATTGCCCATTTATGGAAGCTCACGGACGGGCATCAGATGATGATCCATTGTTAGTTCAATTATGATT 

TTGGAAAGAAAATGATAAGGCAGAGTGTTCTAAAGAAAGTGTAAAAGGT/WWWVGTTC/WVKGGA 

GGAGATAGTCTTGTTTATGTGATAACGCTACTAGGAACGGCTAGTTTATTAGTGCCTATTTTATTATTGACTAAAGGCA/^^ 
AATCATAG 

SPy0777 
Seq ID 38 

GTGATTTCTTTTGCCCCATTTTTAAGCCCCG/^GCTATTAAACATTTGCAAGAA^ 
GCACAGCTC/VS^CAAATTG/Sw\GCAATTTATACTAGTGGCC/WV\TATACTTGTATCAGCTTCTGCTGGTTC 

AATGGTCGAACGCATACTTGATA/WVTTTTGAGAGGTGTTTC/^TTGATCGGCI I i I I ATCTCAACCTTTACTGTTAAAGCAGCTA 
CAGAACTGCGTGAGCGGATTG/WVVCAAATTATACTCAC/WVTTGCTCAAACTACAGATT^ 
CAATTGC/VATCTCTTTGTCAAGCTGATATTGGTACTATGGATGCTTTTGCACAG/WVGTAGTA^ 
GCATTTCATCCC/VATTTCGTATCATGCAGGAT/WSiGCAG/W\CAAGATGTTT^ 

GTTTATG/V^TC/WWKGAGGCACCGGTGTTTAGGGCTCTTGTGAAAAATTTTTCTGGTAACTGTAAAGACACT^ 

GAGTTAGTTTATACTTGTTATTCTTTTAGCC/V\TCGACAG/WVACCGAAAAATATGGTTGCAAGAAAAT^ 

/\AGTTACC/W\GAGTTGAAGATATGGGGGATCATGATATTGAACTGTTACTTTTGGG/V\TGCAAGAGACTGCAAATCAGGT/^ 

GATGTGACTGATATGGAAGATTATGGGCAGCTGACTAAGGCAGGTAGCCGATCTGCTAAATACACTAAACACTTAACGATCATA 

GAAAAGTTGTCTGATTGGGTGGGTGATTTTAAATGTTTGTATGGAAAAGCCGGATTGGATCGGTTGATCAGAGATGTGACAGGC 

CTTATACCATCTGGGAATGATGTTACAGTCTCG/WKGGTAAAATACCCTGTTTTTAAGACCTTGCATCAAAAATTA^ 

GCATTTAGAAACAATTTTAATGTATCAGAAAGACTGTTTTTCCTTATTGGAACAGTTACAAGATT^ 

GTTATTTAGCTGTCAAAATACAAGAAAGTGCTTTTGAATTTTCAGATATTGGACACTTTGC/V^TCAAAATT^ 

GATATTCGCCAATGGTATGAGCAACACTATCATGAGGTGATGGTTGATGAATATGAAGATAACAATCATATGCAAGAGGGACTCC 
TGACCTTACTATCGAACGGTCATAATCGCTTTATGGTAGGAGATATCAAAC/KATCGATGTATCGATTTCGGCAAGCCGATGCTCA 
GATTTTTAATCAAAAGTTTAGAGACTATCAAAAAAAACCTGAGCAGGGG/WKGTGATTTTACTC/ 

CAGAGGTGTTAAATGTCAGCAATGCTGTTTTTAGTCACTTAATGGACGAATCAGTAGGAGACGTCTTATACGATGAGCAACATGA 

GTTAATAGCAGGTAGTCATGCTCAAACAGTCCCCTATCTAGACCGTCGTGCTCAGTTATTGCTATATAATAGCGATAAAGATGAT 

GGGAACGCGCCTTGAGATAGTGAGGGTATTTCATTTAGTGAGGTTACAATTGTTGCC/WKGAAATTATTAAGGTTCAC/k^^ 

AGGGTGTCCCTTTTGAAGACATTACGTTACTCGTTTCTTCAAGAACAAGAAATGATATGATTTCTCATACATTCAATGAATATG^^ 

ATTCCTATAGCAACAGATGGTGGGCAGCAAAACTATCTTAAATCTGTTGAAGTGATGGTTATGTTAGATACATTACGCACGATTA 

ATAACCCAAGAAATGATTATGCCCTTGTGGCTTTACTGCGCTGAGCGATGTTTGCCTTTGATGAGGATGATTTAGCAAGAAT^^ 

ACTTC/WW^GAC/V\TGAGCTAGATA/W\GATTGCCTATATGACAAGATAC/KAAGGGCTGTGATTGGAAGAGGTGCTCATGCTGA 

ATTGATTCACGATACCTTGCTTGGCAAGTT/W\TGTTTTTTTAAAGACGTTG/Wiu^G 

TTGTATGACTTGATTTGGAAAATTTTTAATGATCGTTTTTATTTTGAT^ 

TAATCTATACGCATTAGCTCTACGTGCT/\ATCAGTTTG/WWN.TCGGGCTATAAAGGGCTATACCGTT^ 

AGGTACTTGAGACGCAAAATGACTTAGCTGATGTGG/\AGTGGCTAGTCCTA/^AC/\AGCTGTTAATTTA^^ 

TAAAGGTTTAC/WKTTTCCGTATGTATTTATCCTT/VSiTTGTGACAAGCGCTTCTCAATGACAG 

TCGGCAGCACGGTATCGGTATCAAGTACCTTGCAGATATCAAAGGTTTACTTGGTG/W\CAACACTC/W\TTCTGT^ 

ATGGAAACCTTACCTTATCAATTGAACAAACAAGAGTTGCGCTTAGCAACTTTATCAGAAG/W\TGCGCTTA 

GACACGAGCTGAAAAAAAAGTTTATTTTATTGGTAAAGCTAGTAAGAGCAAAAGTCAAG/W^ 

AAACTTTTGCCGCTGGCTTTACGAGAACAGTTATTGACATTGCAAGATTGGCTATTAGC/J^TAGCAGATAT^ 

TCTTTATTTTGATGTTCGCTTTATTGAAGATAGTGATTTGACACAAGAGTCAGTCGGACGACTTCAAACACCACAGT^^ 

CAGATGATCTT/WKGATAATCGTC/\ATCAGAAACAATTGCACGGGCTTTAGATATGTTAG/\AGCAGTGTCTCAA^ 

TTATGAAGCAGCTATTCATTTGCCAAGAGTTCGAACGCCTAGCCAAGTTAAGGGAACTTACGAGCCTTTATTAG 

GTAGATATTATAGAG/WVTCTTCTCGATCGGTATCTGATTTTACTTTGCCACATTTTTCAAAAA^ 

TATTGGATCAGCTGTTCATCAGTTGATGCAGGTGCTCCCTTTGTCAAAACCGATAAATCAACAAACGCTTTTAGACGCTTTAAGA 

GGAATTGATAGTAACGAAGAGGTAAAAACAGGTCTTGATGTCAAAAAAATAGAGTCGTTGTTTTGTGATACAAGGCT^ 

TTTTTGAGACTTACCAAAAACACTTGTATCGAGAAGCGCCATTTGGTATTTTAAAACT^ 

CTACGTGGTATTATAGATGCCTACTTTTTGTTTGATGATCATATTGTATTAGTGGACTATAAAAGAGATAAATAGAAG^^ 

TGAGTTAAAAAAGCGTTACCAACAACAGTTGGAGTTATATGCAGAAGGTCTCACTCAAACGTATAAACTTCGTGTGACTAAGCGC 

TATCTTGTTTTAATGGGAGGTGGAAAGCCAGAAATTGTCG/VAGTTTAA 
ATGGTAA/WS^CGGATTTT/\MTTACGCTATCAAGGGAGCGCAATTGGCTATCTATGGTCTATCTTAAA^ 
GATTATGTACTTGGTATTTAT^ 

TCTGGTCl M I M I iCAGAA GCAACTAGCATGGG/i^TGGTATCTATTGTATCTCGGGGAGACTTGTTGCGAAAATTGAACTTTTCT 

AAGCACATGATTGTTTTTTCGGCAGTGTTAGGAGCTTT/^ATTAATTT^ I I I I GCGTTGATAA 

ATGGTGTGACTATATCAGGGTATGCTTATCTCTCTCTTTTTC i I i 1 I A TAGAATTAGTTGTTTTAGTGCTTGGAATTGCATTGTTAT 

TGTCG/\ATGT GTTTG TTTATTATCGTGACTTAGCTC/>^GTCTGGG/>AGTACTATTACAAGCAG 

TATCCGATGAGTTTTGTTTTAGATAGGGACCCTTTGGCGGCAAAGTTGTTGATGCTAAATGCAGTAGCAC/^ 

TCGTTATTTATTGATTGACAGGGCC/VKCGTAACGATTTGGCAGATGTCAACCAATTGGTTTTACATTGT^^ 

CATTTGTTATATTATTTATTGGCATCTTTGTCTTTAAGAAAAATGCCGATAGATTTGCGGAGAT^^ 

SPy0839 
Seq ID 40 

ATGACATTTTTATCTGATTTGATATCATTAATGACAAAAATCAGATTATCTTGGGTAATAAAGGCGGGTATTTTTCAA^^ 

GT/kACGATTGCTAATATTGTCTTATCAGAATTTTTCTATTTTATATTAGACGTTACTGGTCAATATCATTTAGAT 

GTGACTTTTTTAAAAAATCCTATAGGAGTTGCTTTATTAGGTGCCTATTTATTTTTATTAGCTGGTTT^^ 

GTCTATATCGAATTATTGCGGATCAAGAAATTAGTTTCTATC i I I M AGAAAACAGTTTTCTTATTACCTAAGGGGGCTTTGGAAA 

AGATTTTCTGGTTACC/y\TTATTACTTTTTTTGCTTTATATCCTATTGACTATTCCAGTCTTACATATTGG 

CTCAAAAGCTTTATCTTCCAGAATTTATTGTTGGGGAATTATCA.^AGATAAGTAGCACAAAGTACTTGCTTTATGG 

CTTGTGTTTTACCTTAACCTAAGATTAGTATATTTTTTACCATTGATAGC/y\TCAACCATCGTACGGTTGCTCAAGCATGGAGAGA 

GAGTTGGCAAAAGACTAAAAAGAAACATGTATTGTTATGGATGAAACTTTTTGCAATCAATG 

TAGCTATTTGCATGATTCTTATTTTTGTTGATATGTTTAA 

CATGGGAACTCATTTTTTTTACTACTATTTTTTTTAAACTCTGTTCAGCAATGATTTTAA^ 

ATGA TGAGC CAAGAAGAAGTAATAAGGCATATGTTGTAATCTTTATCGTGGTTACAGTAGGTTTTGGTTATCAATCTCTTGAACGT 

TTAACTTTTTTTGACACATCTCACTCTAAGACAGTTATCGCGCATAGAGGACTTGTATCAGCAGGTGTAGAAAATTCTCTGG/^ 

CCCTTGAAGGTGCTAAGAAAGCAGGAAGTGATTATGTAGAACTGGATCTAATCTTGACTAAGGATAATCACTTTGTGGTGTGTCA 

TGATAATC GATTG AAGCGTTTAGCTGGAGTAAATAAGACGATTCGCAACTTAACCTTAAAAGAAGTTGAACATCTAAGGAGTCAT 

CAAGGAGATTTTTGAGGGCGTTTTGTTTCTTTTGACACTTTTTATCAAAAGGGT/^AGAAGTTGA^ 

CAAGCCAATTGGTACAGAACGTGGAAATTATGTCGATTTGTTTTTAG/WVCT^^ 

TCATGTCTTTAGATTTAG/\AGTAATAGAAGCTATCAAGAAAA/\AAATCCATC/V\.TTAC 

GATTTTTTGGAGATG/VSiTTTGTTGATTTCTATGTCATTGAAGACT^ 

ATA/\AGAAATTTACGTTTGGACTATTAATGATCCCAAGCGCATAGAGCATTATCTCCTAAAGCCTATTCAGGG/\AT^^ 

CAACCAGCTTTAACTAATCAATTGATT/WKGACTTAAAACAAGAT/iATTCTTAT^ 

TATTAA 

SPy0843 
Seq ID 41 

ATGAAGAAACATCTTAAAACAGTTGCCTTGACCCTCACTACAGTATCGGTAGTCACCCACAATCAGG/\AGTnTTAGTTT 

AAGAGCCAATTCTTAAACAAACTCAAGCTTCTTCATCGATTTCTGGCGCTGACTACGCAGAAAGTAGCGGTAAAAGCA^ 

GATTAATG/W\CTTCTGGCCCTGTTGATGATACAGTCACTGACTTATTTTCGGATAAACGTACTACTCCTGAAAAAATAA/^ 

ATCTTGCTAAAGGTCCGAGAG/V\CAAGAGTTAAAGGCAGTAACAGAGAATACAGAATCAGAAAAGCAGATCACTTCTGGATCTC 

AACTAGAACAATCAAAAGAGTCTCTTTCTTTAAATAAAACAGTGCCATCAACGTCTAATTGGGAGATTTGTGATTT^^ 

GGGAATACCCTTGTTGGTCTTTCAAAATCAGGTGTTGAAAAGTTATCTCA/\ACTGATCATCTGGTATTGGCTAGTCAAGCAGCAG 

ATGGAACTCAATTGATACAAGTAGCTAGTTTTGCTTTTACTCCAGATAAAAAGACGGCAATTGCAGA^ 

AGAAAATGGGGAAATAAGCC/\ACTAGATGTGGATGGAAAAGAAATTATTAACGAAGGTGAGGTTTTTAATTCTTATC 

AAGGTAACAATCCCAACTGGTTATAAACATATTGGTCAAGATGCTTTTGTGGACAATAAGAATATTGCTGAGGTTAATCTTCCTGA 

AAGCCTCGAGACTATTTCTGACTATGCTTTTGCTCACCTAGCTTTGAAACAGATCG 

GAATTAGCTTTTTTTGATAATC/W\TTACAGGTAAACTTTCTTTGCC 

AAACCATATCAAAACAATTGAGTTTAGAGGA/^ATAGTCTAAAAGTGATAGGGGAAGCTAGTTTTCAAGATAATGATC 

GTAATGCTACCTGACGGTCTTGAAAAAATAGAATCAGAAGGTTTTACAGGAAATCCAGGAGATGATCACTACAATAACCGTGTTG 

TTTTGTGGACAAAATCTGGAAAAAATCCTTCTGGTCTTGCTACTGAAAATACCTATGTTAATCGTGATAAGTCACTATGGCA^ 

AGTCCTGAGATTGATTATACTAAATGGTTAGAGG/VVGATTTTACCTATCAAAAAAATAGTGTTACAGGTTT^ 

AC/WWKGT/W\ACGTAATAAAAACTTAGA/^TTCCAAAACAGCAC/VATGGTGTTACTATTA 

GC/\ATGTTGATTTTCAAAAT/WV\CTTTACGTAAATATGATTTGGAAGAAGTAAAGCT^ 

TTTGCTTTTCAATCTAAT/VVCTTGAAATCTTTTG/KAGCAAGTGACGATTTAGAAGAGATTA^ 

TATTG/W\CCTTGGAATT/W\AGATA/\ATTAGTTAGTATTGGTGATGCGGCTTTCCATATTAATCATATTrATGCCA^^ 

AG/VVTCTGTACAAGAAATAGGGCGTTCAGCATTTCGGC/VW\TGGTGCAAATAATGTTATTTTTATGGGA^ 

TTAGGTGAGATGGCATTTTTATCAAATAGACTTG/VACATCTGGATCTTTGTGAGC/VWSACA^^ 

CCTTTTCAGAC/V\TGCCTTGAAAG/Vs^GTATTATTACCAGCATCACTGAA/\ACGATTCGAG/\AGAAGCC^^ 

AAAACAACTGG/VKGTGGCATCTGCCTTGTCCCATATTGCTTTTAATGCTTTAGATGATAATGATGGTGATG/^^ 

AAGTGGTTGTTAAAACGGATCATAATTCCTACGCACTAGCAGATGGTGAGCATTTTATCGTTGATCCAGATAAGTTATC^^ 

ATAGTAGACCTTG/W\AGATrrTAAAACTAATCGAAGGTTTAGATTATTCTACATTACGTCAGACTACTC/^ 

CATGACTACTGCAGGT/WVGCGTTGTTGTCAAAATCTAACCTCCGACAAGGAG/WW\C/WW\TrCCTT^ 

TTCCTTGGCCGCGTTGATTTGGAT/W^GCCATAGCT/W\GCTGAG/\AGGCTTTAGTGACC/\AGAAGGCA^ 

TTGCTTGAAAGAAGTATTAACAAAGCGGTATrAGCTTAT/VKTAATAGCGCTATTAAAAAAGCTAATGT^ 

GTTAGACTTGCTAACAGGATTAGTTGAGGG/W^AGGACCATTAGCGC/V^GCTACAATGGTACAAGGAGTTTAT^ 

GCCTTTGCCATTGCCAGAATATTATATCGGATTGAACGTTTATTTTGACAAGTCTGGAAAATTGATTTAT^ 

ATACTATTGGCGAGGGACAAAAAGACGCTTATGGTAATCCTATATTAAATGTTGACGAGGATAATGAAGGTTATCATGCCT^ 

AGTTGCCACTTTAGCTGATTATGAGGGGCTCGACATCAAAACAATTTTAAATAGT/\AGCTTAGTCAATTA^ 

TACCGACTGCAGCCTATCATAGAGCCGGTATTTTCCAAGCTATCCAAAATGCAGCGGCAGAAGCAGAGCAGTTATTGCCT/\AAC 
CAGGTACGCACTCTGAGAAGTCAAGCTCAAGTGAATCTGCTAACTCTAAAGATAGAGGATTGCAATCAAAGCCAAAAA.CGA^^ 
GAGGACGAGACTCTGCAATATTGCCTAGGACAGGGTCAAAAGGCAGCTTTGTCTATGGAATGTTAGGTTACACTAGCGTTGCTT 
TACTGTCACTAATAACTGCTATAAAAAAGAAAAAATATTAA 

SPy0872 
Seq ID 42 

ATGAAAAAATATTTTATTTTAAAAAGTAGTGTATTGAGTATCGTGAGTAGTT^ 
GTTGATGTGCAArrCCTTGGCGTCAATGATTTTCACGGCGCTCTTGATAATACCGGAACAGCTTACACACCAAGTGGTAAAA^^ 
CAAATGCTGGGAGGGCTGGTC/V\TTAGGTGCTTATATGGATGACGCTGAGATAGACTTC/^G 

GTATACGTGTTCAAGCTGGAGATATGGTCGGAGCCAGTCCTGCTAAGTCTGCACTTTTACAAGATGAGCCTACTGTCA/VAGTCT 

TT/\ACA/\AATG/V/\ATTTGAATATGGCACTCTTGGTAATCATGAATTTGACG^ 

CAAGCGCGTGATCCTG/V\.TC/\AC/^ATTAATGATATCACC/\AAC/WrATGAGCACGAAGGTTCGC^^ 

ATGTTATTGAT/WWW^CC/\AGGATATCCCCTATGGTTGG/W^CGTTATGCTAT/WV\GACATAGCCATT/V^ 

AAGATTGGCTTGATTGGTGTTGTGACTACAGAGATTGCAAATCTCGTTTTAAAGC/VAAACT^^ 

AGCTGAAACCATTGCOAAATATGCT/WKGAACTAC/V^G/\AC/KACATGTTCATGCTATTGTGGTTT^ 

AGTAAAGATGGTGTTGTTGATCATG/WS^TGGCTACGGTTATGGAAAAAGTG/VKCGAAATCTATGCCGAACATAGGAT^ 

TTTTTGCAGGACATAATCATCAATACACTAATGG/^CTATCGGTAAAACACGTATGGTTCAAGCCCTGTCTCAAGGAA^ 

TGCAGATGTGCGTGGTACGCTAGATACTGATACGAATGATTTTATTAAAACTCCATCAGCAAATGTTGTTGCTGTAGCACCAGGT 

ATCAAAACAGAAAATTCAGATATCAAAGCTATAATA^^ATCATGCTAATGATATTGTTAAAACAGTTAGTG 

TGCAAGTAATTCTTCAACTATTTCTAAAAGAGAAAATATTGATAAAGAATCTCCTGTCGGTAACTTAGCAACAACGGCTCAG 

GTATTGGTAAGAAAACTTTTCCAACTGTTGACTTTGGTATGACCAATAATGGTGGTATTCGAAGTGACCTAGTTGTCAAA^ 

CGGAGCATCACCTGGGGAGCTGCACAGGGTGTACAACCATTTGGTAATATCCTTCAAGTCATTCAAATGACTGGTCAAGACATT 

TACGATGTCCTAAATCAGCAATACGATGAAAACCAGACCTATTTTCTTGAAATGTGAGGTTTAACATACACTTATAG 

TCCTAAGAAGTCTGATACCCCCTTCAAGATAGTTAAGGTTTATAAAGACAATGGTGAAGAAATTAACTTAACA^ 

TTGTTGTCAACGACTTTCTTTATGGTGGTGGTGATGGCTTTTCAGCATTTAAAAAAGCTAAATTAATCGGAGCTATTAACA^^ 

ACTGAAGCTTTCATCACATATATGACAAATTTAGAAGCATCAGGTAAAACTGTTAATGCTACTATAAAAGGGGTTAAAAA^ 

AACTTCAAACCTTGAAAGTTCGAC/WVKGTTAATAGTGCTGGTAAACACAGTATCATTAGTAAGGTTTTTAGA^ 

/V^TACAGTGTCTAGTGAAGTCATTTCAGACCTTTTGACTTCTACTGAAAACACTAATAAGAGCCTTGGGAAAAAAG 

CAAACAAAAATACTATCTCTAGTTCCACTCTTCC/y\TAACAGGGGACAATTATA/WVTGTCTCCTATTATGAC/V\T^^ 

ATAAGCTTAGGTGGACT/W\CGCTTTTATTAAAAAAAGGAAATGCTAG 

SPy0895 
Seq ID 43 

ATGACTAATAATC/W^CACTAGACATCCTTTTGGATGTCTATGCTTATAATGACGCCTTTAGAATTGCTAAAGCCTTGCCA^ 

CCCTAAAAGTGCCCTCTATTTACTAGAGATGTT/W^GAGCGCAGAG/\ATTG/VVCCTTGCCTTTCTAGCGGM 

GAATGGGACCATTG/\AGACCAGTATCACTGTTCATTATGGCTTAACC/SATCGCTTGAAGATGAGCAGATTGCCA^ 

GATTTAGAAGTTAAAGT/V/WWSiCGGTGCTATTATTGATTTCGTCAGGTCAGTGTCGCCTAT^ 

/^TCACGTCAGAAATrCC/W\CTTCAAGGCTTATATTTTTGATACA/Su^G/^A^ 

TGGAATCTGATCACGAGGTTTTCAAGGCTTACCTGTCTC/W\AGCAGTCTCGCAATGTGACGACC/WV\GCTTAGCA 

TGACGTTGACCTCCTTACCTCAGGAAATC/VAGGACrrGGTTTTTTTGTTACGACAT^ 

TCATTTGATT/^GCCTTTTGATGAAGAGGAACTGCATCGCACCACTCATTTTTCTTCTCAGGCT^ 

TGGCGACTTTTTCTGGTGT/\ATCTACCGACGTGAGCCTTTTTACTTTGATGAC 

TTGGAGACAATCTATTGTCTGA 

SPy0972 
Seq ID 44 

ATG/KAGACAACATCCCTGATT/W\GTAGATTTGCCATCAACAATCGGTATAGGTTATGGCGCTTTTTGGCGGTC^^ 

ATCGAGTAGTTAAAGGCAGCCGTGGATCTA/WW^TCTAAAACGACTGCTTT/W^TTTTATCGTCAGACTGCTGA^ 

GGCTAACTTATTGGTCATCCGTAGATACTCAAACACTAACAAACAATCTACTTATACCGATTTTAAATGGGGGTGTAATCAAT^ 

AGGTTACACACCTTTTTAAGTTTAATGAGAGTTTGCCAGAAATAACTGTA/V\GGCAACGGGCCAA/\AGATACTGTTCCGTGGACT 

TGATGATGAGTTAAAAATCACATCTATTACTGTCGATGTTGGCGCTTTGTGCTGGGCTTGGTTTGAAGAGGCTTATCAAATTGAG 

ACCGAAGATAAGTTTTC/^CAGTTGTCGAATCAATCCGCGGTAGTTTAGATGCTCCTGAI Mill I AAACAGATAACAGTCACGTT 

T/V\CCCGTGGTCAGAAAGACATTGGCTTAAACGTGTCTTTTTTGATG/\AG/W\CT/W\CGGGGTGATACA 

ACATTTAGAGTAAACGAATGGCTTGATGATGTCGATAAAAGACGCTACG/V\GATTTGTACAAGACT/\ATCCAAGGCGGGCT^ 

ATCGTGTGCGATGGTGAATGGGGCGTTGCTGAAGGTCTTGTTTTTGATAACTTTGAAGTCGTAGATTTTGATGTTGAA/^^ 

TTCAACGCGTTAAAGAGACCTCGGCCGGTATGGACTTTGGGTTTACTCAAGACCCTACAACTCTTATATGTGTTGCAGTTGACCT 

CGC/W\CAAAGAGTTATGGCTrrACAACGAACATTATCAAAAGGCTATGTTAACAGATCATATTGTC/^^ 

ACTTGCATAGGTCTTACATCGCAGGGGATAGCGCCG/WW\CGCCTCATTGCAGAAATAAAAAGTAAAGGGGTGTCTGGAATTG 

TCCCGAGTATTAAAGGTAAAGGGTCAATCATGCAAGGGATTC/KATTCATGCAGGGGTTTAAGATATATATTCACCCATGT^ 

ACACACAATAGAAGAGTTTAATACTTACACTTTTAAGCAAGACAAAGAAGGTAATTGGTTAA^ 

CACGTTATTGATGCGATTAGATATGCGCTTG/WW\TACCATATCAGAAGCAACGAGTC/W^TCAGTTTG/V\GTTCTTAGGG 
GTTTTGGTTACTAG 

SPy0981 
Seq ID 45 

ATGGCAG/VVG/W\CACAAACAGTTG/W\CGGTTGAAGAGC/\AGTGGTACCAG/5w\GC/WV\C^ 

CACAGATGCAGATGTGGACGCTATCATCGAC/WWVGTTTGCG/\AGTGGAAGTCAGAAC/^AG/s^GCGGAG/W^ 

AA/W^ATGGCTAAGATG/SiATG/WW\GAGAAAGCAGACTACG/WKAGCAG/\AGCTGTTAGACG/^AT^ 

ATAAGACACGCAATGAGTTAACAGCAGTAGCTCGTCAAATGTTTGCAG/s^TCTGAAATCAACGTC/s^ 

AGTTGTGACTTTGGACGCAGAACA/\AC/W^AGCA/iATGTAACAACGCTAGC/WS^CGCATTTGCT/^ 

C/\AGGCTCTTGTACGCCAGACTACTCCGTCAACAGGTGGTGGATTGAGC/W^CAAACCAATTACGGTGCT/\ACTTGGCTAGT/^ 
GGCAGCACAAC/W\GCACC/W^CrrTTTTAG 

SPy1008 
Seq ID 46 

ATGAGATATAATTGTCGGTACTCACATATTGATAAGAAAATCTACAGCATGATTATATGTTTGTCATTTCTTTTATATTCC 

GTTCAAGCAAATTGTTATAATACAACCAATAGACATAATCTAGAATCGCTTTATAAGCATGATTCTAACTTG 

TATAAAAAATTCTCCAGATATTGTAACAAGCCATATGTTGAAATATAGTGTGAAGGAT/WWKTTTGT 

TTGGATATCACAGGAATTCAAAGATAAAGAAGTAGATATTTATGCTGTATCTGCACAAGAGGTTTGTGAATGTCCAGGGAAAAGG 
TATGAAGCGTTTGGTGG/^TTACATTAACTAATTCAGAAAAAAAAGAAATTAAAGT^ 

ACAGGCGCCTATGTTTATTACAGTCAAT/W\CCGAAAGTAACCGCTCAGGAAGTGGATATA/W\GTTAGAAAGT^^ 

A/Vi^TACGATATCTATAAT/ViiCCGGG/V\CAAAAATACTCTAAAGG/\AGTGTTACCTTAGATTTA/^ 

GATTTGTATTATTTTGGC/VATGGAGACTTTAATAGCATGCTAAAAATATATTCCAATAACGAGAG^^ 



TGTAGATGTGTCAATCAGCTAA 

SPy1032 
Seq ID 47 

GTGAATACTTATTTTTGCACACACCAT/WVCAATTACTACTTTATTC/WVC^ 

TGCCATTTATGCCGATACACTGACTTCA/s^TTCAGAACCTAATAATACTTACTTTCAAACGC 

AAAAAAAGGTAGTACAGCCAC/V^C/WWSiGACTACTATACTGAATTGTTAGACG/^ATGGAAC^^ 

TTATGATAAAACC/^ATCCTGACATGGTCACTTTTCATAATAAAGCTGAAAAGGATGCTC/W^CA^^ 

GTGACCACGAAAATAG/V\CTTACCTTTGGGAACATGCAAAGGATTATTCCGCTTGTGCTAATATCACG/WK^ 

TG/WyWiTAGCAAAACAGATCACT/U\TCCTG/MKTCATGCTATTATCAAGATAGTAAAGCTAT^^ 

CCTTCATGTATGAACACGCTTATAATCTAGATCGTGAAAATGATCAAACAACTGGAAAAGAAAAGAAAGA/^ 

TG/WkTTGGAAGTCCTCGTGCTATTAATAATACCTTATCGTTGATGTATGCTTATTTT^^ 

TCCAATCGAAAAATTTGTGCCTGACCCTACTCGTTTTAGGGTTGGCGCTGGCAATTTTTCACCTTTTGAAG 

TTAATTGATATGGGGCGTGTTAAACTCATTTCCGGTATTCTTGGTAAAGATGATCTCG/yV\TTAGTGATACAATC 

GAAAGTTTTCACGCTAGTTGATGAAGGAAATGGTTTTTACCAAGACGGTTCTTTAATTGATCACGTGGTTA 

GGACTTTATAA/WVKGGCATTGCTTACACTGGAGCTTACGGTAATGTGCTTATAGATGGCTTATGGGAATTAATTGGTA 

AAAAACAAAGTCTCCTATAAAAGCGGATAAAATGGGTACTATCTATCATTGGATTAACCATTCTTTTTTCGCT^^ 

GAGAAATGATGGATATGACTCGAGGGCGTTCTATGAGTCGTTTTAATGCCCAATCTCATGTTGCTGGCATTGAAGCACTTCGTG 

CTATTTTACGTATTGCTGACATGTCTGAAGAGCGTGACCGTTTGGCACTTAAAAGACGTATAAAAACACTCGTCACACAAGG 

TGGTTTTTACAATGTCTATGATAATTTGAAAACGTATCACGATATC/VAACTTATGAAAGAACTACTAAGTGATAGT 

GGAAAAACTTGATAGTTACGTAGCTAGTTTCAATAGTATGGATAAATTGGCACTATATAAT/^ATAAAGAGGATTTT 

TATCAATGTTTTCGAATCGAACTCAA/y\TTATGAAGGTATGAATAATGAAAATCTTCATGGCTGGTT^ 

ACCTATACAATAACGATTTAGGACACTACAGTGAAAAGTATTGGGCAACGGTAAATCGCTACCGCTTAGGTGGAACCACAGAAA 

GTGAGCAAAAACCAGTAGAGGGAACTCCTGAGAATATTAAAACGAACTATCAACAAGTTGGCATGAGTGGTCTCTCTGATGACG 

GTTTTGTTGCAAGTAAAAAACTTAATAAGAGAAGTGGTCTAGCTGGTATGACCTTCACTAATTGGAATAAAAGGCTGACCGTCAAT 

AAAGGGTGGTTTATCTTAGGAAACAAAATAATCTTTGTTGGTAGCAATATCAAAAACCAATCATCTCACAAGGCGTATACAACTAT 

TGAACAACGAAAAGAAAATCAAAAGTACCCTTAGTGTTCTTATGTT/V\C/WS^TCAACCCGTTGAGTTGAATAATG 

TTAGAAACACTAAAAGTATTTTCCTTG/WVGTGATGATCCCGCTCAAAATATTGGTTACTACTTCTTCAAGCC 

ATAAGTAAGGCACTTCAAACAGGG/W>iTGGCAAAACATAAAAGCTGATGACAAATCACCAGAAGCCATC/W^GAAGT^ 

CGTTTATCACTATCATGC/WVACGATACTC/VAGATGGCGATCGTTATGCCTATATGATGCTTCC/W\TATGACTCGTC/^ 

GAAACCTATATTAGC/\AGCTTGATATCGAGTTATTAG/WVACAATGACAAACTGGCCGCTGTCTACGATGATGATAGTC/^ 

TGCACGTCATTCACTATGG/WWWVGCAACGATGTTTTC/WSiTCATAATCTTTCTCATC/^ 

GTCAGGCAAAATCAAC/WTAA 

SPy1054 
Seq ID 48 

TTGGTGACCTTTGGAGGTGCAAGTGCGGTT/\AGGCGGAAG/W^ATGAAAAAGT/\AGAGAGC/VSiG/Wi^ 

TCTGAAAAGCTAGTGGAAATT/\ATGACTTAC/WKCTTT/W\TGGTGATAAAGAGAGTATAGAGTGTGTCGTAGATTA^ 

AAGAGGAA/VACTTGAAGAAG/KATGGATGGAATATTTGAATTCTGGTATTC/KACGCAAAGTTT^ 

GGTG/WVAAGGAGAAC/\AGGTCCTACTGGAAAACAAGGCGAGCGTGGTGAGACCGGCCCTGCAGGTCCACGTGGTGAGAAGG 
GCGAAACTGGTGACAAAGGAGCCCAGGGTCCAGTAGGTCCCGCTGGCAAGGACGGCCAAAAGGGTAAAGATGGTCTTCCAGG 
TAAAGACGGCAAGGACGGCCAAAACGGTAAAGATGGTCTTCCAGGTAAAGACGGCAAGGACGGCC/KAGACGGT/W^GATGGC 
GTCGCAGGTAAAGACGGTAAGGATGGCCAAAATGGCAAAGATGGTCTTCCAGGTAAAGACGGTCAACCAGGTAAACCAGCTCC 
T/W\ACACCAGAGGTCCCTCAAAACCCAGATAGTGCACCACATACTCCAAAAACCCCTCGGATCCCTGGTCAATGAAAAGACGT 
GAGACCTGCTCCTCAAAACCCTTCTAATAGAGGTGTAAAC/W\CCACA/\ACACAAGGTGGTAATGAGCTCGC/\AAAACACCGGG 
AGGTCACGACACACACAGAG/KATTGGGAGGAAGAGGCG/W^CAACC/V^TCCATTCTTTACAGCAGCTGCTGTAGCTATCATGAG 
GACAGCTGGAGTTGTAGCTGTTGCAAAACGTCAAGAAAAC/KACT/KA 

SPy1063 
Seq ID 49 

ATGTATATATTCTCATCGTCAAAA/WKGATAGTGCTAAAGAATTAGTTATCTTGACTCCTAATAGGGAAACTATTTTA^ 

TATTCCAGCCTTTGAGGAAAAGTATGGGGTTAAAGTAAGATTAATCCAAGGTGGGACGGGCCAACTTATTGATCAATTAGGTCG 

AAAAGATAAACCATTAAACGCTGATATTTTCTTTGGTGGCAATTACACTCAATTTGAAAGGCATA^ 

TTCTCCGCAGGTTTCTAGTGTGATTTCAGATTACCAATTGCCTAGTCATCGCGC/^ACCCCATATACGATCAATGGCAGTGTACTG 

ATTGTTAAT/^AGGAATTAGCAAGAGGACTTCATATTACCAGTTATGAGGATTTGGTAC/W\GCAGGTTTAAAAGGCAAAAT^ 

TGGTGATCCCAACAGTTCATCAAGTGGGTTGTGAGAGCTGAGT/VSvTATATTGTTAGCTAAGGGGGGGTACAGAAAGGCTGAGGG 

TTGGGGTTAGATGAAGCGCTTGTTGGTGAATATG/\ATTCTATTAGGGCTACGAGTTCTTCAG/\AGTCTATC/iATGTGTGGGTG 

GGT/\AGATGATTGTTGGGCTAACCTAGGAAGATGGTTGTATCAACCTGCAAAAAAGTGGTGCCAATGTTTCGATTG^ 

AAG/^GGAAGGGTGTTTGTGCCCTCCTCTGTTGGTATTATCAAACATGCGCC/W\CATGACAGAGGCTAAGGTCT^ 

TATGTTATCACGTGATGTGCAA/\ATGGCTTTGGCCAATCAACCAGT/iu^GCGAGGGATTCGTCAAGATG 

CATGAAAGCCTTAG/WKCGATAGCTAGTTTGAAAGAGGATTATGCTTATGTTACCAAGCAC/KAGAAAAA^ 

AACCAGTTGCGCCAACGGTTGG/VAAAAGCTAAGTAG 

SPy1162 
Seq ID 50 

ATGCCGACTAGTATTAAAGCTATTAAAGAAAGGTTAGAGGCCGTTACTAGCCTCTTGGAGCCCGTGTTTCAAGAATTGGC/^^ 

GAGACTAGGTCAGGGGTCC/\AAA^GCTGTA/WiAGCCGACAAAAGGTTATTCAGGCGGAGTTAGCAGAAGA^ 
AGGGATGGTrrGTTATG/WWkGCTGTTTAT/WWHAGGTTATAAAGCGATTGCAGGTATT 

GCAGGTGCCGTTGTGGCAGCTTGTGTGATTTTACCTAAGTATTGTAAAATTAAAGGCCTTAATGATTGTAAA/K^^ 

GTAAGCATGAGACCATTTATCAGGCAGTGAAAGAAAAGGCTTTGGCTATCGGTATCGGTATTATTGACAATCAGCTTATTGATGA 

GGTGAATATTTATGAAGCAACCAAACTGGCCATGCTAGAAGCCATTAAACAGTTGGAGGGCCAACTGACACAACCAGATTAT^^ 

GTTGATTGATGCCATGACATTGGATATTGCTATTTCGCAGCAGTCTATTCTTAAAGGCGATGCCAATTCCTTGTCTATTGCAGCA 

GGATCAATTGTAGCTAAGGTCACCAGAGATGAGATGATGGCTAACTATGATGGGATTTTTCGTGGTTATGAGTTTGGTAA/^ 

GAGGCTATGGCACCAAAGAACATTTAGAGGGATTAA/V\GCTTACGGGATAAGGGGTATGCATGGTAAAAGTTT^ 

ATGGATGTGGTGCGATTC/V^GTAATCGTT/\A 
SPy1206 
Seq ID 51 

ATGACAGTTAAGGAAG/W^CGATGAGTATTTTAGAAGTT/^AGCAGTTGAGTCACGGTTTTGGGGATCG^ 

TGTCATTTCGCCTCTT/W\AGGCGAACATATTGGACTAGTTGGGGCAAATGGTGAAGG/W\ATCAACCTT^ 

AGGACATTTACAGCCTGACGAAGG/VAAAGTAGAGTGGTCGAAGTATGTCACTGCAGGTTACCTGGATCAACATACAGTGTTGGA 

ATCAGGACAAACCGTTCGTGATGTCTTGCGAACTGCTTTTGATGAGTTATTTAAGACCGAGAATCGTATT/^^ 

TC/^ATGGCAGATGATAAAGCTGATATTGCTGTTTTGATGGAAGAAGTAGGTGAGCTTCAAGATCGTTTAG/^ 

ATACTTTGGATGCT/s^GATTGATGAAGTAGCGCGTGCGGTTGGTGTTATGGATTTTGGAATGGAGTCAGATGTTAC^^^ 

AGGTGGGCAACG/S^CAAAGGTTTTATTAGCCA/U^TTACTATTAGAAAAACCTGATATTCTGCTAT^ 

TGGATGCTGAGCATATTGAATGGTTAAAACGCTATTTACAACATTATGAAAATGCTTTTGTGTTGAT^ 

TAAATGATGTGATTAATATTGTTTATCATGTTGAAAATCAAAGTTTAGTTCGCTATACTGGGGATTATT^ 

ATGAGATGAAACAATCTCAACTTGAAGCAGCCTATGAACGTCAACAAA/^AGAGATTGCTAACTTGCAGGAT^^ 

TAAGGCTCGGGTAGCGACACGTAACATGGCAATGTCTCGCCAAAAAAAACTTGATAAGATGGATATAATTGAAGTTCAAGCTGA 

GAAACCAAAACCAAATTTTGAATTTAAGCAAGCTAGAACTCCCAGTCGATTCATTTTTCAAACAAAAM 

ATTAGGCATTGACGAAAGAAGCCTTAAATATAACGTTTGAAAGAAATCAAAAAATTGGTATTGTTGGGGCC;^ 

ATCTACTTTGCTAAAAAGTTTATTAGGTGTTATTGAGGCTTTAGAAGGTCATATTGTCACAGGGGATTT^ 

TTGAACAAGAAGTGACAGGTGTTAAGCGACAAACTGCGCTAGAAGTAGTTTGGGATGCTTTTCCTGCCTTAAATCAGGCAGi^ 

TTCGAGCGGCACTAGCTCGTTGCGGACTAACATCAAAACATATCGAAAGTCAAATTCAAGTACTTTCGGGTGGTGAACAAGCAA 

AAGTTCGTTTTTGTTTGTTGATGAATCGTGAAAATAACGTGGTTATTTTAGACGAACCAACAAATCATGTTGATAT^ 

AATGAGCTC/y^AGGTGCTTTAA/kAGGATATAAGGGTTCTATTTT/Vi^TGGTTTGTCATGAACCTGA^^ 

CCGATACTTGGGATTTTAGTAAGTTAACCTAA 

SPy1228 
Seq ID 52 

ATGAACAAGAAATTTATTGGTCTTGGTTTAGCGTCAGTGGCTGTGCTGAGTTTAGCTGCTTGTGGTAATCGTGGTGCTTCT/W\G 

GTGGGGCATCAGGAAAAACTGATTTAAAAGTTGCAATGGTTACCGATACTGGTGGTGTAGATGACAAATCATTCAACCAATCAG 

CATGGGAAGGCCTGCAATCTTGGGGTAAAGAAATGGGCGTTCAAAAAGGAACAGGTTTCGATTATTTTCAATCTAC/i^ 

CTGAGTATGCAACTAATCTCGATACAGCAGTTTCAGGAGGGTATCAACTGATTTATGGTATCGGCTTTGCATTGA^ 

TGCT/W»^GCAGCTGGAGATAATGAAGGAGTT/V\GTTTGTTATTATCGATGATATTATCG/\AGGAAAAGAT^ 

ACCTTTGCCGACCATGAAGCTGCTTATCTTGCAGG/\ATTGCAGCTGC/WWKCAAC/W\AAC/W^ 

GGTATGGAAGGAACTGTCAT/VACTCGATTTGAAAAAGGTTTTG/^GCAGGAGTTAAGTCTGTTGACGATACi^^ 

TTGATTATGCTGGATCATTTGGTGACGCTGC/WV\GGAAAAACAATCGCAGCAGCTCAGTATGCAGCAGGTGCTGATGTTAT^ 

ACCAGGCAGCAGGAGGCACTGGAGCAGGTGTATTT/^ATG/VAGCAAAAGCTATTAATGAAAAACGTAGTG/VAGCTGATA^ 

GGGTTATTGGTGTTGACCGTGATCAAAAAGACGAAGGAAAATACACTTCT/WVGATGGC/WVGAAGC^^ 

ATCAATC/W^GAAGTCGGT/\AAGGTGTTCAGTT/^TC/Vi^C/V^GCAAGTAGCAGATAAAA/^^ 

TATGGTCTAAAAGATGGCGGTGTTG/WVTCGCAACTACAAATGTTTC/WSw^G/\AGCTGTT/^ 

AAATT/WS^TCTGGTGACATT/VAAGTTCCTG/WWVTAG 

SPy1245 
Seq ID 53 

ATGAA/\ATGAAAAAAAAATTCTTTTTGTTAAGTCTTTTGGCCCTATCAACT^ 

GGTGAGTCAATAACCGCTGTAGGATCAACAGCACTACAACCCTTAGTAGAAGCAGTAGCTGATGAATTTGGAAGCAGTAATCTA 

GGCAAGAGTGTCAATGTTC/\AGGTGGTGGTTCAGGTACAGGGTTGTCTCAAGTTCAATCAGGAGCTGTCC/KAATTGGAAATAGT 

GATGTCTTTGCGGAAGAAAAAGATGGTATTGATGCTTCTAAATTAGTTGATCATCAAGTAGCTGTTGCAGGACTTGCAGT^^ 

GCAATCCTAAAGTCAAGGTTTCCAATCTCAGTAGTCAGCAGTTGC/^AAAGATTTTTTCAGGAGAATATACCAATTGGAA^ 

TGGAGGAGAAGATCTTGCGATTTCAGTGATCAACCGAGCAGCAAGTTCTGGCTCACGAGCAACCTTTGACAGTGTTATCATG/KA 

AGGGGTCAACGCTAAACAAAGTC/V^GAGCAAGACTCCAATGGGATGGTTAAATCGATTGTTTCACAAACACCAGGTGCCATTTC 

TTACCTTTCCTTTGCCTACGTTGATTCATCTGTTAAATCTTTGC/VATT/W\TGGGTTTAAGGCAAATGC 

ATGATTGGCCAATGTGGTCCTAGGAACACATGTATACCAAAGATAAACCAACAGGGTTGACCAAGGAATTTCTTGATTATATGTT 
TTCAGATGAAGTACAACAGAACATTGTTACACATATGGGATATATTTCGATAAATGATATGGAAGTGGTCAAATCTCATGATGGA 
/WSiGT/Ju\CAAAAAGGT/\A 

SPy1315 
Seq ID 54 

ATGACGCACAAAATAAAAGTATTGCTGCTTGCGATAATGTCTATTI I I I I GACATGCAATATTGCAAGTGCTG/VAACTATTGCTAT 

TGTTTCAGATAGAGCTTATGCCCCATTTG/Su^TTTAAAGACTCAGATCAAATTTACA/VAGG/V^TTGACGT^^ 

TAGGCAAACGTC/s^TCTTGGGATTTCAGTATGAGTTTCCCGGGTTTTGATGCAGCTGTAAATGCTGTT^ 

GTGCTCT/!^TGGCCGGTAC/\ACCATTACGAATGCTCGTAAGAAAGTCTTTCATTTCTCAGAGCCATAT^^ 

GATTGCGACACGTAAAGCC/kATGCCATC/WW^TACAGTGACTT/W\AGG/WWVCGGTCGGTGTT/WV^ 

TC/\AGCCTTTTTG/SiATAACTAT/WWWS^GTATGATTATACTGT^ 

TGCTGGTTCTATTGCCGCTGTTATGGATGATGAGGCGGTTATCC/KATACGCAATGAGGC/W\ACCAAGATATTGCTATTAACATG 
AAAGGAGAGCCCATTGG/^AGCTTTGGGTTTGCTGTC/WW^GGGAAGCGGATATGATTATCTAGTTAATGA^ 
TT/Ws^GCTATGAAAGCTGATGGTACCTACC/\AGCTATCATGACCAAGTGGTTAGGCACAGATGATAAAGCTACCACCAGTCAGG 
C/^ACGGGAAATCCATCTGCC/WKGCTACACCTAC/W\GGACAGTTATAAAATTGTCTCTGATTCGTCTT^ 

TC/WWKTGGTAAGGGCAAATACGTTGGTATTGACATAGAATTAATCAAAGCTATTGCT/W\CAACA^ 

GTAATCCAGGTTTCGATGCTGCCTTAAATGCTGTGCAATCTAGCCAAGCAGATGGGGTCATTGCTGGTGCAACTATTACTGACG 

CTCGTAAAGCTATCTTTGATTTTTCTGATCCTTATTATACTTCTAATATCATTTTAGCTGTTAAAGCTGGAA/^^^ 

ATGAAGACTTAGACAGAAAAACAGTCGGTGCTAAAAACGGGACTTCATCTrACTGTTGGTTAAAAGAAAACGCTCCTA^ 

TTATAATGTCAAGGCATTTGATGATGGTTCTAGCATGTATGATAGCTTAAATTCAGGTTGTGTAGATGCTATGATGGATGATGAG 

GCGGTTGTTAAATACGCTATGTCTCAAGGTCGTCGCTTTGAAACACGTCTTGAGGGCATTTCTACTGGTGAAGTTGGTTTTGCT 

TCAAGAAAGGAACTAATCCAG/\ATTAATCGAAATGTTC/y\C/y\TGGGTTAGCTGCTCTC/^AA^^ 

TATAGATAAATACCTTGACTCTAAGAAAGCTGCAACTCCTTCTGAAAAAGGTGCTGATGAGTCTACTATTTCAGGCCTATTATCAA 

AT/\ACTACAAAGAACTATTGGGAGGACTTGG/V\CCACGCTCAGTTTAACGGTTATTTCATTTGCTATTGCT^^ 

ATGTTTGGGATGATGGCCGTGTCACCAACTAAATCACTTCGAGTTATTTCAACGGTCTTTGTGGACGTTGTTCGAGGGATTCGTT 

TGATGATTGTGGCTGCGTTCATTTTCTGGGGAGTAGC/W\CGTTATCGAGAGTATGACCGGCGAGGAGTCACCGATTAATGATT 

TCTTAGGTGCTAC/V\TTGGACTGTCACTTAATGGCGGAGCCTATATTGGTG/W^TTGTTCGCGGTGGTATCGAAGCTGTTCGAG 
CAGGGCAAATGG/s^GCTAGTCGAAGTCTTGGTTTGTCTTACGGAACCACGATGAGAAAAGT/^ATTCTCCCA
TAATGTTACGTAAGTTTATCAATCAGTTTGTTATTTCATTGAAGGATACAAC/W^TCGTCTCAGCM^ 
CAAACAGGT/WiATCATTATTGCTAGAAATTACCAGTCGTrCCGTATGTATGGTATTTTAGCAATTAT^ 
CTCTT/o^C/\AGACTTGCAA/\AOGTTTAG/W^GAGGCTTAACT/VA 

SPy1357
Seq ID 55

ATGGG/WVKGAAATAAAAGTGAAATGGTTTTTGCGTAGATCAGCTTTTGGAT^^ 

CAACAGTATCTGCTGTTGACTCACGTATCGAACAGCCTCGAATTATTCCAAATGGCGGAACCTTAACTAATGTTGT^ 

TCCAGAAAAACTGGCATTAGGTAATGAAGAAAGAGCCATTGATGAATTAAAAAAAGAAGCTATTGAGGATAA^ 

GCTATAGAAGCAGCAAGTTCAGATGCGTTAGAAGCATTAGCGGATCAAACAGACGCTTTACAATCAGAAGAAGGTGCGGTT 

AAAGCGGATAACGGTGCTAGTGACGCCTTAGAAGCATTGGCGGATCAAACAGACGCTTTACAATGAGAAGAAGCTGAAGTAG 

CAATCAGATAACGCTGCTAGTGACGCCTGGGAAAAAGCAGCAACTCCAATCGCTTTAGATGTTAAGAAAACTAAAGATACA^^ 

CCTGTAGTTAAAAAAGAAGAAAGACAAAACGTTAATACCCTTCCTACAACTGGTGAAGAGTCTAACCCATrCTTTAC^^ 

CGCTTGC/Vi^T/\ATGGTAAGTACAGGTGTGTTAGTTGTAAGTTC/WiiGTGCAAAGAAAATTAG 

SPy1361 
Seq ID 56 

ATGAAAACGAAAAAAGTTATTATTTTAGTTGGTCTATTGTTATCATCTCAGTTGACTTTGATAGCTTGTCAATCACGAGGTAA^ 

TACATATCCCATTAAAACGAAACAATCAGGTAAGGGAATGACGTCAAACAA/y\TTAAAGGGATTAAAA/^ 

AAGACTCAC/VXAGGTGTGGCGGGTGTCGATTTTCCTACAGATGATGGGTTTATTTTAACCAAAGAGTCAAAAATCTTA 

CAGATCAGGGAATCGTTGTTGACCATGATGGTCATTCGCATTTTATTTTTTATGCCGATTTAAAGGGAAGTCCATT^ 

ATTCCAAAAGGAGCAAGTTTAGCTAAGCCAGCTGTTGCTCAGCGAGCAGCTAGTCAAGGGACTTCTAAAGTAGCAGATCCTCAT 

CACCATTATG/V\TTTAACGGAGCGGATATTGTGGCTGAAGATGCTTTAGGCTACAGGGTTCGCCACGATGATCACTTCCATTATA 

TTTTGAAGTCAAGCTTATCAGGTCAGACACAGGCACAAGCTAAAGAGGTTGCTACTCGCTTGCCACAAAGCAGTAGCCTTGTTT 

CAACAGCTACAGCTAATGGTATTCGAGGCTTGCATTTCCCAACCTCAGATGGTTTTCAATTTAACGGTCAAGGTATTGT^ 

AACA/WSiGACAGTATTTTAGTGGAGGACGATGGTCACTTACATCCTATTTCTTTTGCGGACGTTCGTCAGGGTGGCTGGGGACA 

TGTGGCAGATC/\ATACGATCCCGCT/WWKAGCAGAA/kAGCCAGCAGAAACCCATCAGACACCAGAGCTATCTG/\ACGTG/W\A 

GG/VS^TACC/i^G/WW*iTTAGCTTATTTGGCAGAAAAATTGGGGATTGATCCATC/KACT^^ 

/W\CTTGGTTTGGAATACCCTCACCATGACCACGCACACGTATTGATGTTATCTGATATTG/WS^.TCGGAAAAGAC^^ 

CACATGCTATTGAGCATGCCCGTGAATTGGAAAAACATAAGGTTGG/VATGGATACCTTGCGTGCCTTAGGGTTTGATG^ 

TGATTTTGGATATCGTTCGCACTCACGATGCTCC/VACCCCATTCCCATCAAATGAAAAAGATCCGAATATGATG/WVG/V^TGGl^ 

AGCAACGGTTATCAAACTTGACTTGGGCAGCCGTAAAGATCCTTTGCAACGT/W^GGACTTTCACTGTTACCC^ 

TTAGGAATTGGCTTTACACCAATC/WS^GATATCTCACCTGTTTTGC/SATTT/W\^^ 

GGTGACTGATTATAGATTTTTGGATAATATGCCACAGTTAGAAGGCATTGATATTTCACAAAAC^^ 

TGAGC/W\TAT/WW\CTTAACTCTAGTAGCGGCTGCTGATAATGGTATTGAAGATATTAGGCCGCTTGGTCAATTACG/Wii^^^ 

CAAATTCCTCGTATTGAGT/W\GAATAAGATTTCTGATTTAAGCCCACTGGCATCGTTACATCAATTGCAAGAATTGCACAT^ 

ATAATCAGATTACAGATTTAAGCCCTGTTTCTCATAAAGAATCATTGACGGTTGTTGATTTATCAAGAAATGGTGATGTTG 

GCAACACTTCAAGCACCCAAATTAGAAACGTTAATGGTCAATGATACCAAGGTTTCTGATTTGGATTTCTTGAAAAA 

TCTATCTAGCCTATGTATTAACCGTGGGCAATTGCAATCTCTTGAAGGTATTGAAGCAAGTAGCGTCATTGTGAGAGTAGAAGCA 

GAAGGTAACCAAATTAAATCGCTTGTGCTTAAAGACAAGCAAGGGTCACTTACTTTCTTGGATGTGACAGGCAACCAGT^ 

CTCTAGAAGGTGTTAATAATTTTACAGCACrrGACATTTTAAGCGTGTCTAAAAACCAATTAAC 

AATAAGACAGTTACTAACATTGATATTAGTCATAACAATATCTCATTAGCAGACCTTAAATTGAACGAGCAACATATTCCAGA^ 
CATTGCGAAAAACTTCCCAGCGGTTTACGAAGGTTCTATGGTAGGTAATGGAACAGCTG/VAGA/W\AGCAGCTATGGCTACTAA 
GGCGAAAGAAAGTGCTGAAGAAGGATCGGAATCACATGACTACAACCAT/!^TCATACCTATGAAGATG/\AG/V\GGTCATGCTCA 
CGAGCAGAGAGACAAAGATGATCACGACCATG/V^CATGAGGATGAAAATGAAGCTAAAGATGAGGAAAACGATGCTGACTAA 

SPy1371 
Seq ID 57 

TTGGGAA/\AC/\ATATAAAAATTTAGTGAACGGTGAATGGAAAGTATCAGAAAACGAGATTACCATTTACGCA 

AAGAGTTAGGATCAGTTCCAGCGATGACGCAGGCAGAGGTAGATGGTGTTTACGCTTCAGCTAAAAAGGCTGTATGAGATTGGC 

GCGCTTTGTCTTATGTGGAACGTGGAGCTTACCTTCAT/W\GCGGCTGATATTTTAGTACGTGATGCTGAA/VVG 

TTCTTTCAAAAG/W\GTAGGCAAAGGTCACAAGGCAGCTGTCAGTGAAGTTATTGGTACCGCTGAAATCATT/\ATTAT 

AG/VVGGGGTTCGTATGGAAGGTGAAGTTCTTGAAGGTGGTAGCTTGGAAGCTGCAAGTAAGAAGAAGATTGGTATTGT^ 

TG/VACCAGTTGGTTTAGTTCTTGCGATCTCACCTTTTAATTATCCCGTT/>w^CTTGGCAGGTTCT/\^^ 

GAGG/WJiTGTTGTTGCTCTTAAACCACC/VVCACAAGGCTCTATTTCTGGTTTGTTACTAGCAGAAGCT^ 

TCCAGCAGGTGTCTTTAATACCATTACAGGGCGAGGTTCTGTTATCGGTGATTATATCGTTGAGCACGAAGCGGTTAGCTTTATC 

/^CTTTACAGGTTCTACTCC/VVTTGGGG/\AGGAATCGGT/WVTTAGGGGGTATGCGACC/\ATTATGCTTGAGCT^^ 

GATTCTGCTATCGTTTTGG/\AGATGCAGATCTTGGTTTAGCAGCG/WW\TATTGTAGCCGGTGCTT^ 

GTTGTACAGCGGTT/W\CGTGTTCTTGTGATGGAC/V!^GGTGGCGGATCAATTGGCGGCTGAGATTAAAACACT^^ 

T/VAGTGTCGG/^TGCCTG/itAGACGATGCTGATATTACACCATTAATTGATACATCAGCTGCTGATTTTGTTG/^ 

AGATGCAACTGAT/\AGGGAGCTACTGCTTTGACAGCCTTTAATCGTG/\AGGCAATCTTATTTCACCCGT^ 

ACAACTGAGATGGGTTTGGGATGGGAAGAGGCGTTCGGCCCAGTATTACCAATTATTCGTGTAACCACTGTAGAAG/\AGCCATC 
AAGATTTCTAATGAGTCTGAATATGGTTTGC/^AGGTTCTATTTTTACAAGTAATTTCCC^^^ 
AGAAGTTGG/KACTGTTCACCTTAAGAATA/WVCAC/VKCGTGGAACAGATAATTrCCCATTCTTAG 
GGGGTACAAGGAGTTAAATATTCTATCG/VVGCTATGACAACTGTTAAATCTGTTGTATTTGATATCCAGTA^ 

SPy1375 
Seq ID 58 

ATGAGTCTCAAAGATCTTGGCGATATTTCATATTTTCGCCTAAATAATGAAATTAACCGTCCTGTTAATGGTAAAATTCCACTT^ 
TA/\AGACAAAGAAGCTTTAAAAGCTTTTTCCGCTGAAAATGTGCTGCC/V\ACACCATGTGTTT^ 

agtatttaatctcaaatgattacattgaatcagcttttattcagaaataccgccctgaatttattagtgaat^ 

aatcagaaaattttgggttt/wvtcatttatggcagcctacaagttgtaggagcaatacgccttaaa;^ 

tatttagaaaacgttgaagacggtgtcttgtttaatgctttgtattttggagatggtcaag/\agacttagc/k^^ 

tgaaatgattaaggaacgttagg/iagcggctactgcttcctttttaaatgctggtggaagcggtggtggtg/\attggtgt^ 

ttcttgattc/kagtaactgatgacatg/\actgtatcggacgttctatc/s^ctctgctttgc/vsittatgccgtatt 
TTGGGATTACCTTGTCTAACCTCCGTGAAGCTGGCGCACCAATCAAAGGCTATGCTGGTGCAGCCTCAGGAGTTGTTCCTGTTA

TGAAATTATTTGAAGATAGTTTTTGTTATTCAAATCAACTTGGGCAACGTCAAGGAGCTGGTGTTGTTTACCTA^ 

CCTGATATCATTGCTTTCTTATCTAGTAAAAAAGAAAATGCCGATG/WSiAGGTGCGTGTTAAAACCTTGTCA^^ 

TTCCTGAT/W\TTCTACGAATTAGCTCGT/WWVCGAGGACATGTATCTGTTTAGTCCTTACAATGTTG/WV^ 

CCGTTTAAGTATGTCGACATTACC/s^TATGTACGATGAGTTAGTGGCG/VACCCTAAAATTACTAAGACTAAAAT^ 

TCTTG/W^CAGAGATTTCAAAATTAC/iu«\C/VAG/i^TCTGGTTACCGTTATATCATCM^ 

CGATGGA/WV^TCATCATGAGCAACTTGTGTTCTG/Ws^TTTTAC/»^GrrCAAACACCTAGCCTTATCAATGATG 

GTAGAAATGGGAACTGATATTTCATGTAACTTAGGTTCCACTAATATCCTGAACATGATGACCTCACCAGACTTTGGCCGT^ 

TTAAGACCATGACACGTGCCCT/V^CTTTTGTTACTGATTCATCAAGCATTGAAGCTGTTCCAACCATTA/yJiC 

AGCTCATACTTTTGGCCTTGGAGCTATGGGACTACATTCTTACCTTGCTCAACATCATATTGAATATGGCAGT^ 

GAGTTTACTGATATTTACTTTATGCTCGTGAATTATTGGACCTTGGTCGAATCCAATAACATCGCTCGTGAGCGCCAAACTO 

TTGTTGGCTTTGAGAACTCTAAGTAGGCTAATGGTAGTTACTTTGATAAATACGTTACAGGACACTTTGTTGC/^^ 

GTGAAAGATCTGTTCAAAGACCATTTTATTCCGCAAGCTTCAGATTGGGAGGCTCTTCGCGACGCCGTTGA/^^A.^GATGGTCTT 

TATCATCAAAACCGACTAGCAGTTGCTCCAAATGGCTCTATTTCTTATATCAATGACTGCTCTGCTTGTATTCACCCAAT^^ 

ACGCATCGAAGAGCGTCAAGAAAAGAAAATTGGTAAAATCTACTATCCTGCAAATGGTTTGTCTACGGATACCATTGCTTACTAT 

AGATCTGCTTACGATATGGACATGCGCA/V^GTTATTGATGTGTATGCCGCTGCGACCGAACATGTGGACCAAGGCTTGTCATTA 

ACTCTATTCCTTCGTAGTGAGTTGGGTATGGAGCTTTATGAGTGGAAAACACAAAGCA/V\GAAAGCACTCGTGATTTATCCATCT 

TACGAA.^CTACGCTTTCAATAAAGGCATT/\AATCTATCTACTATATCCGTACCTTTAGGGATGATGGGG^ 

CCAATGTGAATCTTGTGTCATTTAA 

SPy1389 
Seq ID 59 

ATGAAAGAATTATCGTCTGGACAAATCCGCCAAATGTGGTTGGATTTCTGGAAATGTAAAGGACATTGCGTTGAGCCTTCAGCTA 
ACTTGGTTCCTGTG/W\CGACCCAACGGTTCTTTGGATCAACTCAGGTGTTGCAACCTTGAAAAAATATT^ 

TCCAGA/^TCCACGTATTACCAATGCACAAAAATCAATTCGTACTAATGATATTGAAAATGTTGGTAAAACAGCACGTCACCAT^ 

CTATGTTTGAAATGCTTGGTAACTTCTCAATTGGAGACTATTTCCGTGATGAAGCTATTGAGTGGGGATTTGAACTCTTGACAAG 

TCCAGACTGGTTTGATTTCCGTAAAGAC/KAGCTCTACATGACTTATTACCGAGATGACAAGGATTCGTATAAGCGTTGGATTGCT 

TGTGGCGTTGAACC/V\GTCAGTTGGTGCCGATCGAGGATAACTTCTGGGA/^TCGGTGCTGGTCCTTCAGGTCCAGATAGGGA 

GATTTTCTTCGACCGTGGTGAAGATTTCGATCCAGA/WSiTATCGGACTTCGCCTCTTGGCTGAAGATATCG/WVVCG 

ATCG/WKTCTGG/\ACATCGTTCTCTCACAATTCAATGCTGACCCAGCCGTACCACGTTCAGAATAC/W\GAATTACC/\^ 

ACATTGATACAGGTGCTGGTCTTGAACGTCTTGCAGCTGTTATGCAAGGGGC/WW\C/W\CTTTGA/Su^CTGACCTC1^ 

CAATCATCCGTG/\AGTAGAG/*^GTTGTCAGGT/W\ACTTACGATCCAGATGGCGACAACATGAGTTTC/\AGGTTATCGCTGA 

AGATCCGTGCGCTTTCATTTGCTATCGGTGATGGTGCGCTTCCTGGAAATG/\AGGTCGTGGTTACGTTCTTCGTCGTCTTCTC 

GTCGTGCGGTTATGCACGGTCGCCGTCTTGGCATC/kACGAAACTTTCCTTTACA/^ATTGGTTCCGACTGT^ 

A/\AGCTACTACCCAGAAGTGCTTG/WW\CGTGATTTTATCGAG/WV\TCGrrA/VVCGTGAGGAAGA^ 

CGATGCAGGTAGCGGTCACTTAGATTCATTGCTTGCGCAGCTTAAGGCTGAAGGT/\AGGATACTCTTG/V\GGT/\AAGATATCTT 

CAAACTTTATGATACTTATGGATTCCCGGTrG/^ATTGACAGAGG/VVTTGGCAGAAGATGCAGGCTACAAGATTGACCACGAAGG 

CTTTAAGTCAGCCATG/WSiGAACAACAAGACCGTGCGCGTGCAGCTGTTGTTAAGGGTGGTTCAATGGGGATGCAAAATGA^ 

CCCTAGCTGGTATTGTTG/\AGAATCACGATTCGAATACGACACATATAGTCTTG/W\TC/\AGTCTTTCAGTCATCATCGGTGAT/VV 

TGAACGTACCGAAGCTGTTTCAGAAGGTCAAGCCCTTCTTGTCTTTGCTCAAACACCATTCTATGCTGA/V\TGGGTGGACAGGT 

TGCTGACACAGGTAGAATC/WWKTGATAAGGGTGACACAGTTGCTGAGGTTGTTGATGTTCAAAAAGCACCAAATGGTCAACC 

TCTACACACTGTAAACGTTTTAGCATCACTTTCAGTTGGAACAAACTACACACTTGAAATCAACAAAGAGCGTCGT^^ 

GAGAAAAACCACACAGCTACTCACTTGCTCCATGGAGCTCTTCACAATGTTATCGGTGAACACGCAACTCAGGCTGGTTCATTG 

/\ACGAAGAAGAATTCTTGCGGTTTGATTTTACTCACTTTGAAGCAGTAAGCAATGAGGAACTTCGTCACATTGAAC^ 

ATGAGCAAATTTGGAACGCTCTTACAATCAGAACGACTGAAACTGACGTTGAAACCGC/W\AGAGATGGGAGCAATGGCGCTT^ 

TTGGTGAGAAATATGGT/W\GTGGTTCGTGTGGTTCAAATTGGTAATTATTGTGTTGAACTTTGTGGTGG^ 

TCTTCAG/W\TCGGTCTCTTCAAGATTGTCAAAGAAGAAGGTATTGGTTCAGGCACTCGTCGTATTATTGCAGTTACTGGT^ 

AAGCTTTTG/V\GCTTATCGTAACCAAGAGGATGCCCTAA/V\GAGATCGCTGCTACTGTA/W\GCTCCGCAATTGAAAGATGCAG 

CAGCTAAAGTACAAGCTCTTAGCGAGTCGCTTCGTGATGTTCAAAAAGAAAATGCAGAAGTTAAAGAAAAAGCAGGAGCTGCAG 

GAGCTGGTGATGTCTTTAAAGATGTTGAAGAAGCTAAGGGCGTGCGCTTCATTGGTAGTCAAGTTGATGTTGCAGATGCAGGGG 

CAGTTCGTACATTTGCTGATAACTGGAAACAAAAAGACTACTCTGATGTGCTTGTTGTGGTAGCAGCTATTGGTGAGAAGGTTAA 

TGTCCTTGTTGC/\AGG/VWKGCAAAGATGTGGACGCTGGTAAGATGATGAAAGAATTGGCAGGAATTGTAGGAGGTGGTGGTGG 

AGGTA/^AGCAGACATGGCTATGGCAGGTGGTAGCGATGGAAGTAA/VATTGCAGAGGTGCTAGGAGGAGTTGCTG/WVTAGTGT 

AA 

SPy1390 
Seq ID 60 

ATG/WWs^GTCAAATA/\AGTGATTGGTAGTGTTGTGACATTGGCCTGAGTGATGGCTrrAGCAGGTTGTC/\ATCAAC 

ATACTAAGGTTATTTGGATGA/KAGGTGATAG/^TTAGCGTTAGTGATTrrrACAATG/^ 

GCGATGCT/WkTCTGGTAATTAGTCGTGTTTTTG/KAGCTGAATATGGTGATAAGGTTTC 

ATAA/\ACAGCTG/KACAGTATGGCGCTTCATTCTCTGCTGGTTTGGCACAATGAAGCTTGACACGTGAGACTTTTAAGCGTC 

CCGCTCTTCAAAATTAGTAGAATATGCGGTTA/VAG/\AGCAGCTA/WW^GAATTGAC/V\GACAAGAATATA^ 

TCTTATAGTCC/\ACAATGGCAGTCGAAATGATTACTTTAGATAATGAAGAGACAGCT/W\TCAGTCTTAGAGG/\ACT/\^^ 

/\AGGCGCAGACTTTACAGCTATTGCT/W\GAA/W^CAAG/i^CACCTGAG/W\/W\GTGACCTAT/^ 

/W\TGTACCGACTGATGTCGT/WSiAGCGGCTTCAAGTTTGAATGAGGGTGGCATATCAGACGTTATCTCGGT^ 

TCTTATC/W^AGAAGTTTTACATTGTT/^GGTGACTAAAAAAGCAGAAAAAAAATCAGATTG 

AGCTATCATTATAGCTGAAA/\ATCAAAAGATATGAATTTCCAAAACAAGGTTATTGC/W\TG 

TTAAAGAGAAAGCTTTTGGT/VATATTTTGGCGCAATATGCAAATCTTGGTCAAAAAACTA^ 

AGCGAATCATC/WV\GGTGGAG/\AGAGAAGGGATCAGAATCAGAGGAAACACAGAGATGATGAGCTGAAGAACGAACTGAGAGT 
GAGGCTCAGACGCAAGAGGGAGGTGGAC/\ATAA 

SPy1422 
Seq ID 61 

GTGCTTTATGCAACACCCATTGCAAAGTTAATTGACAGTTACTCTAAACTTCCAGGAATTGGTATC/\AGACGGCGACGAGATTAG 

CCTTTTATAGTATTGG/V^TGTGAAATGAAGATGTGAATGATTTTGGTAAAAAGTTATTAGGAGCTAAAAGAGAACTGAGGTATTG 

TCGATTTGTGGAAACCTTACCGATGAGGATCCTTGTCAGATTTGCACAGACACGAGTCGTGATGAGACGACCATTCTGGTAGTA 
GAAGATGCTAAAGATGTTTCTGCCATGGAAAAAATCCAAGAGTATCATGGCTATTATCATGTGCTTCACGGCTTGAT^

TGAATGGTGTGGGGCCAGATGACATGAACCTTAAAAGTTTAATTACCCGTCTAATGGATGGTAAGGTGAGCG/iAGTTATCG 

CTACCAATGCCACAGCAGATGGGGAAGCAACGTCCATGTATATTTCACGTGTCTTGA/V^CCAGCCGGGATTAAGGTAACTCG^ 

TGGCAAGAGGTCTCGGAGTAGGTTCAGATATTGAGTATGCTGATG/Su^GT/u^CATTATTGAGAGCTATTGAGAATC 

TT/\A 

SPy1436 
Seq ID 62 

ATGGATATGTCTAAATCA/Sw^kTCGTCGTACTTGGC/Ui^GGTTTAGTTGTTATTTTAATAGGTATTCTCAC 

TGTTACGGCAGCCAGAAAAATTAGAAATTTCCCTGATACCACGGAAATTTTGTTAGGAACGAAGGCGACTGAG.^ 

CTTACCATTGACTGGTAGCTACGAATTAGTTTTGGGCGATGTTGAGAATCTGCAAAGGCCAACCTTCGCACACATCCAGCm 

GATCAAGATGAGCCm^TATTAAACGAA/ykGGACTTAAATTCAATCCTGCTGGGTGGCATAATTACAj^ 

GAAAAAGAAGTTGGTTAATGGACCGTGGCGATTTAGTTGGTTACCAATTTAGCGGGTTAAATGACGAGCGTAAA^ 

AATGAC/Wy\TATCTT/iATACTGGCTTTAGTGACAAAAATGCTTTAGGAATGCTCTATTATGAA^ 

CTCTACACCCTAACTTCTGGCTAGAGTATAAAGTTAGTCCTGTTTATCATAAAAATGAGTTAGTTCCTCGCCAAGTAGTTCTAGAG 
TATGTTGGAATTGATGAAAATGGAGATCTAGTTCAAATTAAGTTAGGTAGTGAAAAAGAAAGTGTAGACAACT^ 
CAGTTAGATTAGATAACGTATCTCCTTTAGGTGAATTGGATTACC/WKCAGGAATGATGGTAGATTGAACTCA/^ 
TAGTAATTTAG/W^CCG/V\GAGTTTGAAG/kAGCGGCTTAA 

SPy1494 
Seq ID 63

ATGACTAGTAAAAAAGCGTGTTTATCAAGCATCATTGTGTTAGCAAGTTTAACGTGTGGAAATGATACTGTTAGTG 

CTCAGCAACTGGAGATAAGTTTGATGATTGCTCAACACTTGTTGAAAAAGATGTGGCCCCTAAAGATGAACTTGAGATGTTAGCA 

TGGTCGTCGTCTCAAACAACTGATGATGCTGACAGAGACTATGAAGATTTTCTCGATGATGATTCTTTTATTTGTCAAAATGAAA 

TGATAAGATGTTTGAGAATTTAACTGATGATAGGTTATT/W\TGAATTAGATGAATTAGATGAAGAAAATGAAGAAGATG/^ 

ATACAATTGAGCCAGAGCAAAATGTAATAATGCCTAGTGACGATGAGCTATTTGATTTAAGTGATGCTGTTGAGACACGCCTTAC 

TGTTTCTAGTGCTCCCCATTTAGAGGGTGAATTGCCGAAACCACATTTGAGGAGCCTATGAGATACAGCACTGGGGTCTGGTGA 

AATTAGAGGACATTTAGAT/VSiCAAACTGGACGCTTTGTCTGT/V\CAGCTACAAAGTTAGCATTAACGATGGCT^ 

TGAC/VKCGCATGTCTATTCTATAGGTG/WKGCTTTAGTGAAGTATTAGCTGCTCATTATGAAGACAGAAAAGCAGA^^ 

TTCTAAG/WW^GAGATTrCACCTTCCTATTGCTACTCCAGATGTTGTTATAGAGGAGTTAAGGCGCCTAGTC^ 

AGTTCAA/\AG/VAGATGTTTCAGTTCCTTATAGTCGG/\AGCTAGGTATGGCAGTTGC/WWVGAAAAAT^^ 

GGAGAGAGGTTCTCTTATTATCCAGTTTTACTTGGTTT/\ATGATATTAGGATT/s^CGCCGATTATGATACC/V^ 

TTAG 

SPy1523 
Seq ID 64 

ATGGC/WW^GAT/W^GAGAAACAAAGTGATGAC/s^GCTCGTTTTGACAGAGTGGCAAAAGCGTAACATTGA^ 

AGAAGCAGC/!^GCTGAGGAAGAA/WWW\CTC/WVGAGAAATTATTGAGTGATAAAAAAGCGCAGCAGC/^ 

CTGAAGCGGTTGAGCTT/W\ACTGATGAGAAAACTGATAGTCAGG/W\TTGAGTCAGAAACGACGTCAAAACCTAAAAA^ 

AAAAAGTTAGACAACCCAAGGAAAAAAGCGCGACACAAATCGCTTTTCAAAAATCCTTGCCTGT^ 

CATGGCGGTGTCTATTTTTATGATCACTCCTTATAGCAAAAAG/WVGAGTTTTCTGT/KAGAGGAAACCATCy^ 

GAATT/W\TCAAAGCTAGCAAAGTCAAAGCATCTGACTATTGGTTAACGCTGTTAACTTCGCCTGGTCAGTATGAACGACCGAT^ 

TTCGTACTATTCGATGGGTGAAATCTGTACATCTCTCTTAGC/V^TTTCCTAATCACTTTCTATTTAACGTTAT^ 

TGGCTTATGCACAAGTTG/W\ACGGTTTTCAGCCTATTTTGGAGAATGGAAAACGTGTGGACAAGGTCAGGGCATCAGAACTAC 

CGAAATCTTTCTTGATTCTTAATTTAAAAGATGAGAAAGCGATCCAACAGTTAGTT/W^GCAATTAACGACA^^ 

GTC/V^GAATATC/iu\GTCAGTGTCTCTTGCAAATTCCAA/V\CGACAGCGGATTTACTACTTATTGAAATGGATGAC 

TTAGAGTACCGCAGTCACAAGTCAGATTGAAACTTCGCTATTATCAAAAATTGA/WWKAACCTTG 

ATGG/\AGTGGGAATTTATACTAC/V\CACAGGAGATTGAAAATCAACCTGAAGTTCCTCTTACGCCTG/V\CAA/^ 

AAGAAGGAGATAAGCCTGGTGAACATCAGGAACAGACAGACAATGATTCAGA/^ACGCCAGCAAATCAGAGTAGTCOTCAGCAA 

ACACCACCATCCCCAGAAACGGTCCTCGAACAGGCCCATGGCTAG 

SPy1536 
Seq ID 65 

ATGAAAAGACTTAAAAAAATCAAATGGTGGTTAGTGGGTCTGCTAGCTTTAATCTCTTTGTTGCTAGCGTTATTT^ 

TTATTATATTGAAATGCCTGGAGGCGCTTACGATATTCGGACTGTCTTACAAGTCAATGGCAAAG/\AGAC/W^CGAA/^GGAGC 

TTACCAGTTTGTTGCAGTGGGCATTAGTCGTGCCAGCCTCGCTGAGCTATTATATGCTTGGCTGACACCGTTTACTGAAATTAGT 

ACAGCAGAAGATAC/^CAGGCGGATACAGCGATGCTGATTTCCTTCGAATT/W\TC/V\.TTTTACATGGAAACATCACAAAATC 

CTATTTATCAAGCTTTATCCTTAGCTGGAAAACCAGTTACATTAGATTAT/WKGGCGTATATGTTT^ 

ACTTTTAAAGGAACGGTACACTTAGCAGATACTGTAACAGGTGT/W\TGGT/W^CAGTTTACTAGTTC 

ATGTTTCTCACCTAAAACTAGGGGATGAAGTTACGGTTCAGTTTACGAGTGATAATAAGCCTAAAAAAGGAGT^^ 

C/W^CTG/\AAAATGGGAAAAATGGGATTGGCATTGCCTTGACTGATCATACAAGTGTCAATTCAG/V^ 

ACTAAAGGAGTAGGAGGACCTAGTGCTGGTCTAATGTTTACTCTTGATATATATGATC/WS^T/^CTAAAG/^GATl^ 

GCCGTACAATTGCAGGTACAGGAACTATTGGC/V^GGATGGCG/^GTAGGAGATATTGGTGGTGCAGGTCTTAAAGTAGTTGCA 

GCAGCTGAAGCTGGTGCAGATATATTTTTTGTTCCGAATAATCCTGTTGATAAGGA^ 

TAATTACGAAG/SiAGCC/WSiCGGGCAGCCAAACGACT/Wi^GACCAAAATG/VAGATTGTTCCTGTTACGAC^ 

GGTTTATCTTCGCAAAT/^A 

SPy1564 
Seq ID 66 

ATGTTGGAACACAA^TTGATTTTATGGTAACTCTTGAAGTGAAAGAAGCAAATGCAAATGGTGATCCCTTP^ 

CTCGTACAGATGCCAAAGGATATGGTGTGATGAGTGATGTCTCCATTA/\ACGTAAGATTCGT/KATCGTTTGGAAGATATGGGGA 

AGTCTATTTTTGTGCAAGCTAATGAGCGTATTGAAGATGATTTTGGTTCACTGGAAAAACGCTTTTCGCAACAT^ 

ACACGTGACAAAGAAATTGAAGAAA/ivAGCAAATGCATTATGGTTTGATGTTGGTGGTTTTGGACAAGTTTT^^ 

ATCAATTGGGGTGGGTGGAGCAGTTTCCATCAGTATGGCTAAGTCCTTGGAGCCAATTGTCATTTCCAGCCTTC/WKTTACGCG 

TAGTAGCAATGGTATGG/V>vGCT/VKGAATAATAGTGGCCGCTCTTCTGATACGATGGGGACAA/W\GATTTTGTAGATTATGGTGTG

TATGTACTT/WVGGTTCTATCAATGCTTATTTTGCTG/W\AGACTGGTTTTTGTCAGGAAGATGCTGAGGCT^^ 
GGTTAGCTTGTTTG/WKATGATGCGTCGTCTGCACGTCCGGAAGGCTCTATGCGAGTTTGTGAAGTCTTTTGGT^

AAGCAAATTGGGA/KATGTTTC/^GTGCGCGTGTCTTTGACTTGTTAGAGTATCATCAATCAATAGAAGA^^^ 

GCTTATCAGATTCATCT/^AATC/VVG/WW\TTGGCT/W^TATGAAGGGAAAGGGTT/VSiCGCTTGA 

SPy1604 
Seq ID 67 

ATGGCAACTA/WWKGTACATATTATTTCACACAGTCACTGGGATGGCGAGTGGTAGATGGCTTACGAACAACACCACATGCGT 
CTGATT/^ACTT/V^TAGATGACCTGTTAGAAGTTTTTCAAACGGATCCTGATTTTCATAGT^ 

CCTAGATGATTATTTAAAAGTACGCCCCGAACGAGAACGTGAGATTAGACAAGGCATTGCTTGGGGAAAACTCCGTATCGGACG 
TTTCTATATCTTAGAGGACGATTTTTTGACCAGCAGTGAATCCAATGTGCGCAATATGCTGATTGGT^ 

TGGGGCGCTAGTGTGCCAGTTGGTTATTTTCCTGATACCTTTGGAAATATGGGACAAACACCACAGCTGATGTTAAAAGCCGGG 

CTACAAGCTGCTGCGTTTGGTCGTGGCATTCGTCCAAGTGGATTTAAGAATCAGGTGGATACCAGTGAAAAATACAGCTCCCAA 

TTCTCTGAAATCAGTTGGCAAGGCCCAGATAACAGTCGTATTCTTGGAGTCGTCTTGGCCAACTGGTACAGCAATGGCAATGAG

ATCCCGACAACAGAAGCTGAGGCGCGTCTTTTTTGGGATAAAAAACTTGGTGATGCGGAACGCTTCGCCTCAACCAAGCACGTT 

GTGATGATGAAGGGGTGTGATCATCAACCCGTAGAAGTTGATGTCACCAAGGCAATCGGCTTAGCGAACCAACTCTATCCTGAC 

TACGAATTTGTGCATTCCTGGTTTGAAGATTACTTGGCTGATCTCGCAGATGATTTACGAGAGAACGTTTCAACGGTCCA^ 

AGATTACCAGTCAAGAAACCGATGGCTGGTATAGGCTAGCTAACACGGCTTGTGCTCGTATTTAGGTC/W\CAGGGTAATACCA 

GAGTCTCTCGCCAACTCGAAAACATCACCGAAGCGTTAGCAGCAATGGCTTATGAGGTAACAAGTACCTACCGTGACGAGCAAC 

TGCGTTACGCTTGGAAAACCCTCATGCAAAATCAGCGTCATGATTCTATGTGTGGTTGTAGTGTTGATAGCGTTCATCGGGAAAT 

GATGACGGGCTTTGAAAAAGCCTATGAAGTCGGACACTATTTAGCAAAAGAAGCTGCTAAGCAAATTGCTGACGCCATTGATAC 

GAGGGATTTTGGAATGGATAGCGAACCCTTCGTGTTATTTAATACCAGCGGCCATTCCAAAAGAAGTGTTGCTGAGCTCAGCCT 

GAGCTGG/yvWw^TATCATTTTGGGC/V\CGCTTTCCTAAAGAGGTTTACCAAG/^ 

TGTTTCCAAATTATTGACACTAGTGGACAAGTGAGACCCGAAGCAGAAATTTTAGGCACAAGCATGGCTTTTGAGTACGATTTGC 

CCAAGAGATCGTTCCGCGAACGTTATTTCGCCATGAAAGTGAGATTACGGGTACCAATAACTGTGCGAGCCATGTCTTGGAAAA 

GCTTAGCATTAAAGCTAGGAAATGAAAGAACTGGTTGAGAAACCGTTTCCCTCTACGATGACAGTAATGAGTGCCTTGAAAATGG 

GTTTCTAAAAGTTATGATAGAAACCGATGGTCGTCTAACCATCAGCGATAAACAATCTGGACTAATGTATCAAGACCTGTTGCGG 

TTTGAAGATTGTGGCGATATTGGAAATG/XATATATTTGTCGGGAGGGAAATCATGACCAAGCTTTGTATGCGGATGAAGG^^ 

TGAAGCTTAACATCATTAGCAACACCGCTCAAGTTGCTG/VVCTTG/W\TCCAGCAAACCTTTGCCAT^^ 

GCTCTTACAGGCTGAGATGGAGGCTGTCATTGACATCACAGAACGCCAAGCAAGACGTTCACAAGA/WVGGCTGAGCTAACCT 

TAACAACCCTTATCCGCATGGAG/WWVT/\ATCCTCGCCTCCAATTCACCACACGTTTTGATAACC/K^ 

GCGCGTCCTATTCCCAACGCACCTr/WVACAGACCATCATCTAGCTGACAGTATTTTTG/WVCTGTC/W^CGTCCA^ 

GATGCCACCTTTTGGAAG/\ATCCAAGT/VACCCACAGCACCAAG/VATGCTTTGTGAGTCTCTTTGATGGTG/W^ 

ATTGGTAACTATGGCCTCAACGAATATGAGATCTTACCAGATACCAACACCATTGCCATCACTCTCTTAGGTTCTGTTGGCG/W^ 

TGGGCGACTGGGGTTACTTCCCAACACCTGAAGCCCAGTGTCTTGGCAAACACAGCCTTTCTTATAGTT^ 

AGCAAACAC/WmGCCAGCTACTGGCGAGCTCAAG/\AGGCCAAGTCCCTGTTATTACCACACA/iw^C/W\CC/^ 

GATTAGCCGCAGAATATAGCTATTTGACGGGTAGAAACGACCAAGTTGCCCTCAGAGCTTTCA/^ACGTCGCTTAGCAGACAA 

GCCTTATCAGGGGCAGCTATAATCTGTCAAACGATA/W\CTTGTGACTTTAGCCTAAGCCTGCC/WSiCTAC/VVTGCCAAG 

TAATTTGTTAGAAAAAGAGAGCAAGC/WVGCACACCGAGCCAAGTTGGCA/\AGCGGAAATTTT/\ACTCTAGCT^ 

ATAA 

SPy1607 
Seq ID 68 

ATGAAAATCACTAAAATTGAA/KAGA/WW\CGGGTCTACCTTATCGAATTGGATAATGAGGAATCCCTTTATGTAAGAG 

TATTGTTCGGTTTATGTTGAGTAAAGATAAAGTGGTTGACAATGATCAGCTTGAAGACATGAAAGATTTTGCCCAAGTGTGCTAG 

GGCAAAAATTrAGCCCTTTATTTTCTTTGCTTTGAACAACGCAGGAACAAGC/\AGTTGGTGATTAGGT^ 

AAGAACACATTATTGCTGACATCATCAGTCAAGTCCAAGAAGAAGAATGGATAGACGAGACCAAATTGGCTGATAGGTACATTCG 
GGAAAATCAGTT/W^TGGTGAT/WVGGTCCCGAAGTGTT/W^CAA/W\TTATTAC/W^^ 

GTATGTTATCTGAAACTGACTTTAGCCAACTCGGTGAAA/\AGTAAGCCAAAAACTCTTTGACAAATATCAAGAAA/^ 

AAAGCCTTGAAAGATAAAATCACCCAAGCATTACTGACCAAAGGCTTTTCATACGATCTAGCTAAAGATAGCCTCAATCACCTT^ 

ATTTTGACCAAGATAATC/W\GAAATAGAAGATGTTGTTGACAAAGAATTAGACAAAC/VATATGGTA/W\CTCA 

GGTTATACCTTAAAGCAAAAGCTCTATCAGGCTCTCTACCGAAAAGGCTAC/»^CAGCGACGACATTAATTGCAAGTT/^G^^ 

ATTTATAG 

SPy1615 
Seq ID 69 

ATGATCTGTCTACTATGTC/\ACA/VKTTAGTCAAACACCAATAAGTATTACAGAAATCATCTTTTTAAGACGTATC^ 

TGTCAAC/^ATGTCAA/Wij^GCTTTCA/W^GATAGG/WWVGTGTTTGTGCGACATGTTGTGCAAACTCAGAT^ 

GAGATTGTCTAA/\ATGGGA/WKCAAAGGATACAATGT/W\TGATAGAAGCTTATATTGTTAT/\ATGCTGCTATG/^ 

AGTC/iATATAAGTTTCAAGGAGACTATTTATTAAGAAAAGTTTTTGGAGTAG/^ACTTGCCGATG^^ 

CTATATCCCAGTCCCGGTTCCTGTAAGTCCCGGTTGTTTTCGAG/W\GACAATTTAATCAAGTGAGCGCTA 

AATGTTAGCTACCTTTCTCTTTTTG/WV\GCTAGAT/\ATACTCACCAATGTTCCAGAACAA^^^ 

ATCTTATCGACTACTAA/s^GTATC/WKCATTCCTGATA/W^TCCTTATAGTAGATGATATTTATACTACTGGTAGT^ 

TCTTAG/W\ACAATTGGCTAAAGTAGCAAATAGTGACATT/W^AGTTTGTCAATTGCACGTT/V^ 

SPy1666 
Seq ID 70 

ATGAAATCCTTTTCTCTTACTTTTTCATTTCTA/\ACCTTTTGAAGTATGGTAC/ykTTAAAGTTATG 

ACCGTACTCGTTCACGAAACAGTGGACATGCTTGACATAAAGCCTGATGGGATTTATGTTGATGCGACGCTAGGTGGCTCAGGG 

CACTGAGCTTATTTGTTGTCGAAACTTGGTGAAGAAGGGCACCTCTATTGTTTTGACCAAGACCAAAAGGCTATTGAC^ 

AAGTTACCGTCAAATCTTATATTGACAAAGGAGAGGTAACTTTTATTAAAGATAATTTTAGACACCTGAAAGC^ 

CTTGGAGTTGATG/VAATTGATGGTATCTTATATGACGTTGGTGTTTCCAGCCCGGAATTGGATGAAAGAGAAGGAGGGTTTTCT^ 

ATAAACAAGATGCTCCATTGGATATGCGCATGGATGGTCAGTCGCTCTTAACAGCTTAGGAAGTGGTGAATACCTATCCATTCAA 

TGATTTGGTTAAGATTTTTTTCA/VATATGGTGAAGATAAATTCTCGAAGGAGATCGCTCGAAAAAT^ 

AGGCTATTGAGAGAACAACAGAGTTGGCAGAATTGATT/\AGGGAGCAAAGCCAGCTAAAGAGTTGAAGAAA/W\ 

GGTAAAGAGATTTTTG/^GCTATTGGGATTG/V\GTGAATGATG/\ATTGGGAGCGGCCGATGAATGTATTGAGGAGGGTATGGAAT 

TATTAGGCCTTGATGGTCGTATGTCAGTTATTAGGTTCCATTCTCTGGAAGATCGGCTAAGCAAGCAGTTGTTTA/^AGAAGCTAG 
TACGGTGGATGTGCCA/W^GGGCTTCCTCTAATTCCTGAAGATATGAAACCTAAGTTTGAACTTGTTTCACGTAAG
CGTAGTCATTGAGAGTT/Su^CAGGTAATAAAAGGGCACACTCAGCCAAGGTAGGTGTTGCCAAAA/WKTTCGG/^ 

SPy1727 
Seq ID 71 

GTGAC/\ACGACGG/Su\CAAG/V\CTTACCTTGACTCCGTTACGTGGGAAAAGTGGC/WKGCTTAT/WSiGGCACT^^ 
GAATGTGTCTTTATAAAATTAAATACGACCCCTATTCTACCTGCGTTAGCAAAAGAACAGATTGCGCCACAGT^^ 
/^CGCATGGGCAATGGTGATATGATGAGTGCCCAAG/s^TGGCTT/VkCGGCCGTACATTGACCAAAGAAGATATGAACAGTAAG 

CAAATCATTCATATTCTATTGCGCGTTGACAAATCTAAAAAATTAGTCAATC/ykCTGCTTCAGCTCAAT^^ 
TACGATTTATTGGTTGATTTTGAGCAA^ATGCACCCTTGGAAATTCAGCAAA^ 

ACGGAGCTTACGAGAGTTC/W\TCAGAAGTAGCAACGATTGTGCATGGAGATATTAAAGATAGGAATTGGGTGATTACTACTAG^ 

GGTATGATTTTTTTAGTAGATTGGGATTCTGTTCGTCTAACTGATCGGATGTATGATGTTGGTTACGTGTTGAGCCAGTATATTCC 

ACGGTCTCGTTGGTGAGAATGGCTGTCTTATTATGGGTATAAAAATAATGAGAAGGTTATGCAAAAAATTATT^ 

TTTCTCACCTGACACAAATTCTCAAGTGTTTTGACAAGCGTGACATGGAGCATGTGAATGAGGAGATTTATGCCCTCAGA^ 

TAGAGAAATATTTAGAAAGAAATAA 

SPy1785 
Seq ID 72 

ATGATATTAACAGCTCCTATGTCCAACTTAAAGGGATTTGGACCAA/iATCAGCAG/UWVKTTTCAGAAAT^^ 
AGAAGATTTACTGCTTTATTATCCGTTTCGCTATGAAGATTTTAAATCAAAATCTGTTTTTGATTTAGTGGA^ 
TCATTACAGGCTTAGTCGTTACTCCAGCTAATGTACAATATTATGGTTTTAAACGTAACCGTTTAAGTTTG^^ 
GAAGCTGTCTTAAATGTTAGTTTTTTTAATCAACCGTATTTAGCTGATAAAATAGAACTTGGTCAAGAG^^ 

ATGGGATGCCACTAAATGGGCTATTACTGGGATGAAGGTTTTAGCTCAAGTTGAAGATGACATGCAACCTGTTTATGGCGTAGC 

TCAGGGAATTTCAGAGTCTACTTTGATTAAAGCTATTAAGTCAGCTTTTGAAATCGATGCGCATTTGGAATTGAAGG/W^ 

GAGCTACTTTATTGGAAAAATACCGATTGATGGGTCGTAGTCAGGCTTGTTTAGCTATGCATTTCCCAAAAGATATCACAGAGTA 

TAAGCAAGCGCTCCGTCGGATTAAATTTGAAGAATTATTTTACTTTCAAATGAACGTTCAAGTTTTGAAAGCGGAA/^ 

AAAGAAATGGTTTGCGTATTCTTTATAGT/WVCGTGCTATGGAGAGAAAGATTTGGTGTTTAGCTTTTAT^^ 

AAGCGGTCTTTAGATGACATATTATCTGATATGTCATCGGGAGCTCATATG/\ATCGTTTATTGC/\AGGAGATGTAGGATCAGG/^A 

AGACAGTCATTGCTGGTCTATCAATGTATGCAGCTTATACAGCAGGTTTTCAATCGGCTTTGATGGTTCCA^ 

TG/s^C/\ACACTACATTAGTCTGC/^AGAGTTATTTCCAGATTTATCAATCGCTATATTAACTTCGGGTATG^ 

CGTAGGGTTTTAGCAGCTATTGC/WVTGGCTCGGTTGATATGATTGTAGGAACTCATGCTCTTATCCAAGACTCGGT^^ 

ATAAACTGGGGCTTGTCATTACAGACGAGCAAGATCGTTTTGGTGTT/W^CAGCGTAGAATTTTCCGTGAA^ 

CTGATGTTTTAATGATGACAGCCACCCCAATTCCCCGAACTCTAGCAATCACAGCTTTTGGAG/WSiTGGATGTT^^ 

TG/su^TTACCTGCCGGTCGTAAACCTATTATGACACGCTGGGTGAAACACGAGCAGCTAGGTACTGTGTTGGAATGGGTAAAAG 

GTGAATTGCAAAAAGATGCTC/kAGTGTATGTCATTTCACCGTTGATTG/VAGAATCAG/\AGCTTTAGAT^ 

ATTGCATGCTG/sATTATCTACTTATTTTGAAGGAATTGCT/V\GGTTGCTGTTGTACATGGACGT^^ 

CTATAATGCAAGATTTCAAGGATAAAAAAAGTCATATTTTAGTATCCACAACAGTTATTGAAGTAGGGGTAAA^ 

ACGATCATGATTATTATGGATGCCGATCGTTTTGGATTAAGTCAGTTACATCAACTTCGTGGGCGTGTTGGTCGTGGATATAAAC 

AATGATACGCTGTTTTAGTGGCTAATCCCAAAACTGATTCGGGGAAAAAAGGAATGACAATCATGACAGAAAGGACAGATGGT^ 

CGTTTTAGCTGAGTCGGATTTAAAAATGCGTGGTTCTGGTGAAATCTTTGGTACTCGTCAGTCTGG/VATTCCAGA^ 

GCTGATATCGTTGAGGATTATCCTATTTTAG/V\G/i^GCACGCAAAGTTTCTGCAGCGATTGTTTCTGATGGTAACTC 

AAAAACAGTGGCAATTAGTGGCAC/W\ATATTAGAAAAAAAGAAGTTTATGATTAA 

SPy1798 
Seq ID 73 

ATG/WWWi^TCAGC/Wi^TGTGCGTTTGTGGC/\ATATCTGCCCTTGTTCTCATTCAGGCTACTCAAACTGTAy^ 

CTTTAGTTCAGTCACAAGTCGTGACAACAGTAGCTTT/V\CTGAGGAT/^TCGACTTTTAGTTGAAGAGATAGGCCCTTA^ 

TCAATCAGCTGGAAAAGAGTATTATAAACATATTGAAAAGATTATTGTTGATAATGATGTCTATGAAAAAAGCCTGGAGGG 

CGAACCTTTGATATTAACTAGCAAGGGATTAAGATCAATGCTGACCTTATTAAAGACGGTAAGCATGAATTGACTATTGTTAAT^ 

AAAAGATGGTGATATCCT/W\TTACCrrTATT/WW\GGGCGATAAAGTGACCTTTAT^ 

ATCAGGATTCATTAAAAAAAGATGTGCTCAGTGATAAAACAGTGCCACAAAACCAAGGCACACAAAAAGTTGTTAAATCTGGG^ 

AAATACTGCTAACTTGTCATTAAT/V\CA/WVTTGAGTCAAGAAGATGGTGCAATTTTATTTCCAG/W\TTG 

CAAACAGATAAAAGCATTGACTCAGGAAATCACAAAGGTTACAGTCAATGGTACAGTTTATAAAGATCTTATTTCAGAT^ 

AAGATACT/W\TGGGTGGGTCTCGAATATGACAGGGCTTCATCTTGGAACAAAAGGTTTCAAAGATGGAGAAAATAGAATCGTGAT 

ATCCTCAAAAGGATTTG/i^GACGTTACTATTACCGTTACC/ViiGAAAGATGGTCAAATCCAT^ 

ATGTGACTGCTGAAGAGAGACAATC/\AC/WVGTTGGATGTCAGCACTTTGGAAAAAGCTATCAAAGAAGCGGATGCGAT^^ 

CT/W^GA/V^GC/\ACAAAGACGCGGTCAAAGATCTGGCTGAGAAACTTCAAGTCATC/\AGGATTGTTACAAAGA^ 

GT/VVGGTACTCGCCGATACTCATCGACTGTTAAAAGATACCATCGAGTCTTATC/\AGCAGGTGAGGTTTCTATT/VVC/^^ 

AGAAGG/V^CGTATACGGT/WkCTTTAAAGCTAATAAAGA/W\CTCAG/\AGAGTCCTCCATGCTTCAAGGTGCT^ 

GCCAAATTAGTGGTTAAAGCAGATGGTACAATGGAAATTTCCATGCTTAATACTGCTTTGGG^^ 

TG/WKGC/W\GGGACCTACCCAGCAGCAGTGCGTAAACAAGTTGGCCAAAAAGATATC/VATGGTAGCTATATTCGA^ 

TACCATGCCTATTGATGATTTGGAT/W^TTACAC/WS^GGTGCTGTTTTGGTATCAGCCATGGGAGGTCAAGA/^ 

GACTATGACAAATACACGAAACTTGACATGACCTTTAGTAAGACCGTTACCAAAGGCTGGAGTGGTTATCAGGTAGAAACTGAT 

GATAAAGAAA/V^GGGGTTGGGACTGAACGTCTTG/WWVGTTTTAGTT/W^CTTGGCAAAGAT^ 

CAAAAAGGGAATTAGAACAGATTCGAGGCGAGTTGCGTCTAGAGCATTAGGAGTTAACTGATATTTCTTTATTGA^ 

/yV\TATTACAGAACTACATCTGGATGGAAACCAAATTACGGAAATTCCAAAAGAGTTATTTAGTC^^ 

TTAACTTAAGAAGTAATGATTTAACTTATCTAGACAAAGATACATTTAAAAGCAATGCTCAATTAAGAG 

ACTTTATTCACTCTCTTGAAGGAGGACTATTCCAGTCGCTTCATCACCTGGAGCAACTTGATCTTTCCAAGAATCGTATTGGCC^ 

ACTTTGTGATAACCCATTTGAAGGATTGTCTCGTCTGACTTCATTAGGTTTCGCAGAAAATAGTCTTGAGGAGATA 

GGGCTAGAGCCTCTAACATCACTT/y^TTTTATCGACTTATGTCAAAATAATTTAGCACTACTGCCAAAAACAATAGAAA^ 

CGCCTTAAGCACTATTGTGGGAAGTAGAAATCATATTACTCGTATTGATAATATTTCATTTAAAAAT 

CGATTTATGAACTAATGAAATTTCAAATCTTCCAAATGGTATATTTAAACAGAATAACCAATTAACA^ 

TTGCTTAGTGAGGTTG/^GAATCAGTATTTCCAGATGTTGAAAGGCTTAATTTAGATGTGAAGTTG/^TCAGATAA^ 

TGGAA/V\GTAAGAGCTGTTATCGGACAACAGA/V^CTGAGTCCACAA/WVCATATTGCAA/\ACTTGAAGGTTGG^^ 

AAAATAAAATATGATG/SiAGGTTTGAGTGTTTTAGATTTGTATTATTGGGAGCAAAA/^u^GAAAT^^ 

GTCTGTTGAAG/\ATATCAACAATTGTTAC/\AGAAAAAGGTTCAGATAGGGTTTGTTTACTTA^^ 
TTGTGATTCAGTTGC/WWW\AGCTTCCAATGGACAGTATGTGACGGTTGACGAAAAGCTTCTCTCAAATGATCCG

GTTAACGGGAGAGTTTTCTTTAA/^AGATCCAGGTACATATCGGATTCGGAAAGCTTTAATAACA^ 

AAGATATCTATTTGACATCTAATGATATCCTTGTGGCGAAAGGACCACATTCACATCAGAAAGATTTAGTTGAGAACGG 

AGCATTA/\ATCAA/WKC/\ATTGCGTGATGGTATTTACTATTTA/KATGCCAGCATGTTAA^^ 

CAAAC/^AGGCTATTAATCATCGAGTGACTTTGGTAGTTAAAAAAGGTGTTTCCTATTTAGA^ 

GGT/W!w^TGTTAGGCTACCTTGGTG/^TTGAGCTATrrCGTAGATGGTTACC/WVGAGATTTAG 

CAAAAAAGGCAGAGGTTGTGTCTTATTTCACGGATGTAACTGGCCTACCATTGGCAGATCGTTATGGAAAAAACT^^ 

GCTGCGTATGAAATTGATTG/\ACAAGCGAAGAAAGACGGACTTGTGCCATTAGAGGTCTTTGTGCCTATCATGGATGCCA 

AAAAGGGTCTGGCCTTCAAACGGTTTTTATGGGTTTAGAGTGGGCAAGGCTTAC/i^CAGAGAAGGCAA^ 

TAATAATCCACAAGAAAATAGCCATCTAAGTTCAACAGATCAGTTGAAAGGACCTC/WU^TCGTCAAGAAGAA^^^ 

AGTCGTGCTTCAGCAGCTACTGGTATTGCTAAGTTAACTGATGTCCTGGCTAAAAAAGCAACGGGCC/\ATCA/^CTCAAG/^ 

CTAAGACAGATGATACTGATAAGGCAGAGAAATTGAAGCAGTTAGTGCGTGACCATCAAACATCAATTGAAGGTAAAACAGCAA 

AAGATACTAAGAGTAAAAAATGTGATAAGAAACATGGTTGCAATCAACAATCAAATGGTGAAGAAAGTAGCTCTC 

ATTGCAGGTCTATCTAGCTTTATGATCGTAGCTCTGGGATTCATTATTGGTCGAAAGACATTATTTAAATA^ 

SPy1801 
Seq ID 74 

ATGAATAAAAACAAACTATTAAGAGTTGCCATGCTACTAAGTCTCTTAGCCCCGACAGCAGAAAGCATGACAGTGCTGGCTCAA 

GATGTAATGCTTGAGACGCATAAAGCAACTACAAATGAAACCAGTGATTCTTCTTCAAAAGAGGAAAATAATAAA^ 

CTACAACATCAGATAAAACTGACCAAGGTCCCCTTGATGCTTCTGCAGAAACAAACTCTAATAGTCTTGTTAACGCGGATGATAA 

AAAAAGAAGGGATTCTAGTCAGTGTGCTATAGGCTCTTCGGACAACAAGGCAGAAGCAGAAAACCAGGTAGATGATAAATCAAG 

TGATCATTCGAAATCAACTGATCATTGGAAACGAACTGACCAGCCCAAACCATCACCATCTAAAGTTGATACGGCACCTGGTTCT 

TCATTGTCG/W^CAACTGCCAGAGGCAAGAAGTCCTATTCAGTCGTTGTCCCGTTACGTATCAGATTTAGATTTGAGTGAGATAG 

ATATCCCTTCTGTCAACACATACGCGGCATATGTAGAGCATTGGAGTGGTAAAAATGCGTATACCCACCATCTTTTATCTCGCCG 

TTATGGTATTAAAGCTGACCAGATTGATAGTTACTTAAAATCAACAGGCATTGCCTATGACAGCACACGTATTAATGGTGAGAAG 

CTATTGCAATGGGAAAAGAAAAGTGGGCTGGATGTTCGAGCTATCGTAGCTATTGCGATGTCTGAGAGTTCTTTAGGAACTCAA 

GGGATTGGAAGTTTGGTTGGAGGTAATATGTTTGGCTATGCAGCTTTTGATGTAGATCGGACTG/V\GCAAGTAAGT^ 

ATAGTGCTATTGTC/WV»iTGACACAAGACACCATTATTAAAAAC/WWS^TAGC/^TTT^ 

TTTTCACGAGGTC/\ATTAAACTTTGC/V^GTGACGGGGGTGTTTATTTTACTGATACTACTC^ 

TTATGG/VAGACCTGGATAAGTGGATTGATGACCATGGTGGCACACCAGCCATTCCAGCCG/5u\TTGAAAGTGCAGTCATCAGCTA 

GTTTTGCATCTGTGCCAGCAGGTTATAAGCTCTCTAAGAGTTATGATGTCTTGGGTTATCAAGCTTCGAGTTATGCTT 

ATGCACTTGGTATGTGTAT/V^TCGCGCC/W\GAATTGGGTTACC/\ATTTGATCCTTTTATGGGAAA^^ 

AAAGTAGGGTATGCCCTTTCAAAGACTCCAAAAGTAGGTTATGCTATTTCATTTGCACCAGGGCAAGCGGGCGCTGATGG^^ 

TATGGCCACGTATCAATTGTAGAAGATGTTAG/WSiAGATGGGTCTATTCTTATTTCAGAGTCT/V\GTGTA^ 

TTTCTTATCGTACCTTTACAGCTCAGCAGGCTGAACAGCT/sACATATGTTATTGGC/^GAGT/^^^ 

SPy1813 
Seq ID 75 

ATGGAT/WVCATTTGTTGGTAAAAAGAACACTAGGGTGTGTTTGTGCTGCAACGTTGATGGGAGCTGCCTTAGCGACGGACCAT 

GATTCAGTCAATACTGTAAAAGGGGAGGAGAAGACTGTTGAGGTTCAGAAAGGATTACCTTCTATCGATAGCTTGCATTATCTGT 

CAGAGAATAGCAAAAAAGAATTTAAAGAAG/V\CTCTCAAAAGCGGGGCAAGAATCTCAAAAGGTCA^ 

CTCAGCAGGCAGATAAAC/\AGGTCAAGAACTTGCCAAAATGAAAATTCCTGAGAAAATACCGATG/W^.CCGTTACATGGT^ 

CTACGGTGGTTACTTTAGAACTTGGCATGACA/W^CATCAGATCCAACAGAAA/\AGACAAAGTTAACTCGATGGGAGAGCTT 

TAAAGAAGTAGATCTAGGCTTTATTTTCCAGGATTGGACAAAAGATTATAGCCTTTTTTGG/W\GAA1^ 

CCAAAGTTAAAGAAGG/VAGGGACACGTGTCATTCGTACCATTCCATGGCGTTTCCTAGCTGGGGGTGATAACAGTGGTATTGCA 
GAAGATACCAGTAAATACCCAAATACACCAGAGGGAAATAAAGCTTTAGCCAAAGCTATTGTTGATGAATATGT^ 
CCTTGATGGGTTAGATGTGGATGTTGAACATGATAGTATTCCAAAAGTTGAGA/WW\GAAGATACAGGAGGCGTAG/VVCGCT^ 
TATTCAAGTGTTTGAAGAAATTGGGAAATTAATTGGACCAAAAGGTGTTGATAAATCGCGGTTATTTAT^^ 

TGGCTGATAAAAACCCATTGATTGAGCGAGGAGCTGCTTATATTAATTTATTACTGGTACAGGTCTATGGTTCAGAAGGAGAGAA 

AGGTGGTTGGGAGCGTGTTTCT/KATCGACCTGAAAAAACAATGGAAG/^ACGATGGCAAGGTTATAGCAAGTATATTCGT 

ACAATACATGATTGGTTTTTCTTTCTATGAGGAAAATGCTCAAGAAGGGAATCTTTGGTATGATATT/^TTCTCGG/\AGG 

GACAAAGCAAATGGAATTAACAGTGACAT/VVCTGGAACGCGTGCCGAACGGTATGCAAGGTGGCAACCTAAGACAGGTGGGGT 

TAAGGGAGGTATCTTCTCCTACGCTATTGACCGAGATGGTGTAGCTCATCAACCT/WWV^TATGCTAAACAGAAAGAGTTTAAG 

GACGCAACTGAT/^ACATCTTCCACTCAGATTATAGTGTCTCC/V\GGCATTAAAGAGAGTTATGCTAAAAGATAAGTCGTATGATG 

TGATTGATGAGAAAGATTTCCGAGATAAGGCTTTGCGAG/\AGCTGTGATGGCGCAGGTTGGAACCAG/W\AGGTGATTTGG^ 

CGTTTCAATGGCAGATTACGATTGGAT/VKTCCAGCGATTCAAAGTTTAGAAGGTCTAAAT/WKTTT/\^^ 

CTTGATTGGCTTATCTCGCATTAC/WJ^GCTCGAGGGTTCTGTTTTACCCGCT/VN.TATG/\AGGCAGGC/^ 

GTTCTTGAAAGCTAT/WW^GGATAAC/W\GAAG/\ACCTGCTAGTATCCCACCAGTATCTTTG/\AGGT^ 

TG/W^G/KATTAGATTTGTCAGGTTTTGACCGTGAAACCTTGGCTGGTCTTGATGCCGCTACTGTAAG 

TATTTCTGGC/VkCAAACTTGATTTGGGTCCAGGAACAGAAAATCGACA/VATTTTTGAT^^ 

TTGG/\AGGAATGAACAAACAGTG/WVTTTGACAAGCAA/WVCC/KACTGGGCATTACCCAGATACCTATGGG/^^^ 

GCTTACCAGTGGCAAATGAAAAAGTTGATTTGCAAAGCCAGCTTTTGTTTGGGACTGTGACAA^ 

CGAAGGAGAGTATAAGGCTTACC/WWrCATAAAATTGCTGGAGGTAGCTTTGTTGATTCA/^ACTATCATT 

TTTCTTATGAGAACTATACGGTTAAAGTAAGTGATTCCAGATTGGGAACCACTACTGACAAAACGCTAGCAACTGATAAAG^ 

GACCTAT/\AGGTTGACTTCTTTAGCCCAGCAGATAAGAC/WiyQiGGTGTTCATACTGGTAAAGTGATTGT^^ 

ATGATGGTTAATTTGGCAGAAGGCGCAACAGTTATTGGAGGAAGTGCTGATCCTGTAAATGC/iu^GAAAGGTA 

CTGGGCAGTGAGACTGATAATATCTCTTTAGGATGGGATTCTAAGCAAAGTATTATATTTAAATTG/W*^G/\AGATGGATT^ 

GCATTGGCGTTTCTTCAATGATTCAGCCCGAAATCCTGAGACAACCAATAAACCTATTCAGGAAGC/V^GTCTACAAAT^^ 

ATGAAAGATTATAATCTAGATAATTTGTTGGAAAATCCCAATA/\ATTTGATGATGAAAAATAT^ 

GCACAAGGAGAGAGAGCTACTGCATTCAGT/V\TACATTAAATAATATTACTAGTAAATATTGGCGAGTTGTCTTTGATAGT/^ 

GAGATAGATATAGTTCGCCAGTAGTCCGTGAACTCGAAATTTTAGGTTATGCGTTACCTAACGCCGAGACTATCATGAAAAC^ 

AACTACTGCTAAAGAGTTATCTGAACAAAAAGATAAGTTTTCTCAAAAGATGGTTGATGAGTTAAA^ 

/W\CTTGTTTGAAGAGT/\AGATTTTTGATGTAACTGCTATTAATGCTAATGCTGGAGTTTTG 

CTGCTA/WW!0\TAA 



SPy1821 
Seq ID 76

ATGATTG/Su^GC/^GTAAGCTTAAAGCAGGTATGACATTTGAAGCAGAAGGAAAATTAATCCGTGTCCTTGAAGCT^ 

AAACCAGGTA/\AGG/W^CACTATCATGCGTATGAAACTACGTGATGTGCGTACAGGTTCTACTTTTGACACAACTTACGGCCCA 

GATG/WW\TTTGAGC/\AGCCATCATTGAAACTGTCCCAGCACAATACCTATAGAAAATGGATGACACTGCTT 

CTGAGACTTATGATCAGTACGAAATTCCAGTTGCT/SiACGTTGAGCAAGAATTGCTTTACATTCTTGA^ 

CC/\ATTTTATGGAAGTGAAGTGATTGGGGTAACGGTTCGAACAACTGTTG/\ATTGACCGTTGGG^ 

GGAGCGACAGTGACGGGTTCAGGGAAACCTGC/Su^CTCTTGAGACAGGACTrGTTGTTAACGTTCCAGACTTTATCGAAGCTGG 
CC/W^CTAATGATT/Su^CACTGCAGAAGGTACTTACGTTTCTCGTGCTTAA 

SPy1916 
Seq ID 77 

ATGACTAAAAGATTAGCTAAAGATTTTATTTTTGGTGGTGCTACAGCTGCTTACCAGGGTGAAGGGGCTACGCACACAG^^ 

AAGGACCAGTAGCTTGGGATAAATACTTAGAAGACAACTATTGGTACACAGCTGAGGCAGCAAGTGATTTTTATAATCGTTAGCG 

TGTCGATTTGAAACTTAGTGAAGAATTTGGTGTC/^CGGCATGCGTATCTCTATTGCCTGGTCTCGTATTTTTCCAAGAGGAA^ 

GGAGAAGTTAACCCTAAAGGAGTAGAATACTACCACAATCTTTTTGCAGAGTGTCATAAGCGTCATGTTGAGCCTTTTGTT^ 

TTCACCATTTTGATACCCCAGAAGCTCTCCACTCGGATGGTGACTTCCTCAATCGTGAGAACATTGAACATTTTGTAAATTATGC 

AG/^TTTTGTTTTAAAGAATTCTCAGAAGTTAACTATTGGACAACATTTAACGAAATTGGGCCTATTGGTGATGGC 

TTGGTAAATTCCGTCCAGGTATGCAATATGATGTTGCTAAAGTTTTCCAATCAGACCATAAGATGATGGTCTCTCATGCTCGTGCA 

GTCAAAGTCTTTAAAGATAGTGGTTATTCAGGTGAAATTGGTGTTGTCCATGCAGTTCCAACTAAGTATCCATTTGA 

TGCTGATGATGTTAGAGCAGCTGAACTTGAAGATATCATCCATAATAAATTTATCGTTGATGCTACTTATCTTGGT.^GTATTCAG 

ATAAAACAATGGAAGGTGTTAACCATATCCTTGAGGTGAATGGCGGTGAACTTGATCTTCGCGAAGAAGATTTTGCCGCACTAG 

ACGCCGCAAAAGATTTGAATGATTTCCTTGGTATTAACTACTATATGAGTGATTGGATGCAAGCTTTTGATGGTGAGACTG 

CATTCACAATGGCAAGGGTGAAAAAGGCAGCTCT/^AATACGAAATCAAGGGTGTTGGTCGAAGAAAAGCACCCGTTGATGT^ 

AAAAACGGACTGGGACTGGATTATCTTCCCACAAGGCTTATATGATCAAATCATGCGTGTC/WKGCCGATTATCCTAATTACAAG 

AA/V\TTTACATTACAGAGAATGGTCTTGGCTACAAAGATGAGTTTGTAGATAATACTGTCTATGATGGTGGACGTATCGATTATGT 

GAAAAAACACTTAGAAGTTATTTCTGATGCTATTTCTGATGGTGCAAATGTTAAAGGATACTTTATGTGGTCACTGATGGATGTCT 

TTTCATGGTC/W^TGGCTATG/WWVCGTTACGGTGTCTTCTATGTTGATTTTGAAACTCAAGAACGTTATGGT^ 

CTACTGGTAT/WW^GTAGCAG/WVCTCAAGTGATTGAATGA 

SPy1972 
Seq ID 78 

ATGAAA/\AG/W^GTC/VKCC/VkGGATCAAAGCGCTATCAATATCTGTT/WWVAGTGGGGGATAGGTT^ 

GGACTGTCGTGTTAGGGTGCACCCCTAGTATCTTAACACATC/VJiGTTGCTGCTAAAACCATTGTTGGACTAGCCCGCGATGAAG 

CTCAACAAGGAGATGGC/^ATGCTAAATCTGGTGATGGTCTTCAATCGTCTAGCAAGGAGGCAAAACCAGTTTTAGACAGCT^ 

CAGCTAATCCTGCTAGTATTGCTGAGCATCATTTGCGTATGCATTTTAAAACATTGCCAGCTGGTGAGTCGCT^^ 

GACTTTGGGTGTGGGGAGATGTGGATCAACCTTC/WSiGGATTGGCCAAATGGTGCTATCACCATGAC/WSwfiiGCG/\^^ 

GACTATGGCTATTATCTAGATGTGCCACTAGCAGCT/W^CACCGCCAGG/SbfliGTGTCTTATCTCATT/^TAATAAAGCTGGAGAGA 

ATCTTTCAAAGGACCAGCACATCTCGCTTCTCACGCCAAAAATGAATGAAGTTTGGATAGACGAG/VKTTACCATGCGCAGGGTT 

ATGGACCTTTGAAAAAAGGTTACCTTCG/\ATC/KACTACCACAATG/V\TCGGGACACTACGATAACTTAGCTGTGTGGAGCTT^ 

AGATGTCAAAACCCCAACGAGGGACTGGCCAAATGGAGTTGACTTGTCACATAAAGGGCATTATGGAGCTTATGTTGATGTCCC 

CTT/WVAGAAGGAGCTAACGAAATCGGATTTTTAATCCTTGATAAAAGTAAGAGAGGAGATGC 

TATCTATTTAAAGAGTTAGACAATCATACTCAGG 1 1 I i I GTCAAAGACACTGACCCAAAAGTTTACAACAATCCTTATTATATTGAT 

GAGGTTAGTCTCAAAGGAGCTGAAC/W\GCACGCC/\AATGAGATTAAAGCCATTTTTACGACCTTAGATGGGCTTGATGAAGAT 

GCGGTGAAACAAAACATCAAGATCACTGACAAAGCAGGGAAAAGTGTTGCAATTGATGAGTTGACACTTGAGAGGGATAAGTCT 

GT/VATGACATTAAAGGGTGATTTTAAGGGGCAAGGTGCAGTCTAGACGGTTACATTTGGAGAAGTTAGCC/\AGTCGCTGGCGAA 

TCCTGGCAATTAAAAGATAAAGTCTATGCTTACGATGGTGAACTTGGAGGTACCCTAGGTAAGGATGGTTCTGTTGATT^ 

TATGGTCTCCAAGTGCTGATAGTGTT/^GGTTGTGGTTTATGATAAACAAGATCAGACAAGGGTGGTTGGTCAAGCTGATTTGAC 

CAAGTCGGACAAGGGTGTTTGGAGAGCTCATCTAACTTCTGACAGTGTCAAGGGCATTAGTGATTACACAGGCTACTATTACCT 

TTATGAAATCAGGCGCGGTCAGGAAAAAGTCATGGTTTTGGATCCTTACGCCAAATCTCTCGCTGCCTGGAATGATGCGACTGC 

TACTGATGACATCAAAACAGGAAAAGGTGGCTTTATTGATCCAAGCAAACTAGGAGGAAGAGGCCTTGATTTTGCCAAAA^ 

/V\GTTTAAAAAGCGTG/^GACGGTATTATCTATGAAGCACATGTGCGAGATTTTACGTCAGATAAGGCTCTAGAAGGCAAGTT 

CACACCCTTTTGGGACTTTTTCAGCTTTCGTTGAACAGCTAGACTATCTCAAAGACTTGGGGGTTACCCACGTTCAATTG 

GGTTTTGAGTTATTTTTATGCCAATGAGCTGGACAAGAGCGGCTCAACAGCCTACACGTCTTCAGACAATAATTACAACTGGGGT 

TATGACCCACAACACTACTTTGGCCTTTCTGGCATGTATTCGGCAAATGCT/SvATGACCCTGCTTTACGTATGGCAGAGGTTAAAA 

ACCTTGTC/^TGAGATTCAC/W\CGTGGTATGGGTGTTATTTTTGATGTGGTTTAT/^CCAGAGGGCTAGAACCTATCTCT^ 

AGATTTGGAACGG/s^CTAGTATGATTTTATGAATGGTGATGGTACAGCTAGAGAGAGTTTTGGCGGAGGTCGTCTAGGAACGAC 

ACATGCCATGAGTCGTCGTATCTTGGTGGATTGGATTACTTATCTGACTCGTGAATTGAAGGTAGATGGTTTTCGT^ 

ATGGGTGACCATGATGGGGCAGCTATTGAGCAAGCCTTT/V^GGCAGCCAAAGCCATT/\ATCC/W\TACCATTATGAT^ 

GGCTGGCGTACCTACC/\AGGTGATGAGGGG/WWW^GA/^ATTGCGGCAGATCAAGATTGGATGAAAGCAACCAATACGGTCGG 

TGTTTTCTCTGATGATATCAG/WVTACCCTC/V^GTCAGGTTTTCCAAATGAAGGCACAGCAGCCT^ 

AATCTAG/\AGGTTTATTC/W\ACGATCAAAGCAGAGCCTGGTAACTTTG/V\GCAGATGCCCCAGGAGATGTAGTGC^^ 

GCAGCCCATGACAACCTGACCTrACATGATGTCATTGCC/W\.TCCATC/\AT/\AGGATCCT/WkGTGGCTGAAG/\AGAGATT^ 

AAGCGTATTCGTCTAGGAAATACCATGATTTT/Su^CTGCTC/V\GGGACTGCCTTTATCCATTCTGGTCAGG/^^ 

AGCAGCTTCT/W^TCCCGACTAC/iiAGACAAAGGCGTCTGATGACAAGGTGCCAAATAAGGCGACTCTGATTGATGCTGTAGCG 

C/\ATACCCTTACTTCATCCACGATTCTTATGATTCGTCTGATGCGGTCAATCATTTTGACTGGGCAAAGGC/yKCAGAT^ 

CTCACCCGATTAGCAACCAAACAA/WKGCCTATACACAGGGACTAATTGCGTTGGGTCGCTCAACAGATGCCTTTACAAAAG 

CCAAAGCTGAGGTAGATCGGGATGTGACCTTGATCACCCAAGCAGGAGAAGATGGTATTCAACAAGAGGACCTCATCATGGGT 

TACCAAACAGTGGCATCAAATGGAGATCGCTATGCTGTCTTTGTCAATGGAGACAACAAGACCCGCAAGGTAGTTrrACCTC/i^ 

GCCTACCGCTATTTGCTAGGAGCCCAAGTGCTTGTTGATGCTGAGCAAGCTGGTGTTACTGCCATTGCTAAGCCTAAGGGAGT 

CCAGTTTACCAAAGAAGGCTTGACTATTGAAGGCCTAACTGCCGTGGTCCTCAAAGTATCCTC/yW^CGGCTAAT^ 

GCAAAAGAGTCAGACAGACAATCATCAAACCAAAACACCAGATGGCTCAAAAGACCTAGACAAATCATTAATGACTAGACCAAA 

AAGAGCTAAAACAAACCAAAAGCTCCCAAAAACGGGTGAAGCCTCCTCAAAAGGCTTATTAGCAGCTGGAATAGCTCTGCT^^ 

ATTGGCTATTAGGCTGTTGATG/VKGCGCGAAA/^AGATTAG 



SPy1979 
Seq ID 79 
ATG/WWS^TTACTTATCTATTGGAGTGATTGCACTGCTGTTTGCATTAACATTTGG/V\CAGTCAAGTCGGTCC^

GGTATGGATGGCTACCAGACCGTCCACCTATGAATAACAGCCAGTTAGTTGTTAGTATGGCCGGTATCGTTGAAGGTACCGATA

AAAAAGTTTTTATAAATTTTTTTG/W^TCGATCTAACATCACAACCTGC^ 

ATC/W»iACCATTTGCTACAGATAATGGCGCAATGCCACATAAACTTGAAAAAGCTGACTTATTAAAAGCTAT^ 

ATCGCT/^ACGTTCACAGTAACGACGGCTACTTTGAGGTCATTGATTTTGCAAGCGATGCAACCATTAGTGATCGA^ 

GTGTACTTTGCTGAC/WkGATGGTTCGGTAACCTTGCCGACCCAACCTGTCC/\AGAATTT^ 

AGACCATATAAAGAAAAACCAGTAC/W^TCAAGCA/WVTCTGTTGATGTAGAATATACTGTACAGTTTAC^^ 

TGACGATTTCAGAGCAGGGCTCAAAGATACTAAGCTATTGAAAACACTAGCTATCGGTGACACCATCACATCTCAAGAArrACTA 

GGTCAAGCACAAAGCATTTTAAACAAAACCCAGCCAGGCTATACGATTTATGAACGTGACTGCTCAATCGTCACTC 

ACATTTTCCGTAGGATTTTACCAATGGATC/KAGAGTTTACTTACCATGTC/WWkTCGGGA^ 

ACAGGTATTAAAGAAAAAACGAACAACACTGATCTGGTCTCTGAGAAATATTACGTCCTTAAACAAGGGGAAAAGC 

CCTTTGATCGCAGTGACTTGAAACTGTTCACCATCAAATACGTTGATGTCAACACGAACGAATTGCTAAAAAGCGAGC^ 

AACAGCTAGCGAACGTAACTTAGACTTCAGAGATTTATAGGATCCTGGTGATAAGGCTAAACTACTCTACAACAATCTCGATGCT 

TTTGATATCATGGAGTATACCTT/KACTGGAAAAGTAGAGGATAATCACGATAAGAATAATCGTGTCGTTACAGTTTA 

AGGGCCGTAAAGGGGGAAAGGGTAGGTATGATTTAGCTTATGATAAAGATCTGTATAGGGAAGAAGAAGGAAAAGGTTAGAGGT 

ACCTGCGTGATACAGGGACACCTATAGGTGATAACCCTAAAGAC/VAAT/V^ 

SPy1983 
Seq ID 80 

ATGTTGACATCAAAGCACCATAATCTCAACAAACTAGTCTGGCGCTACGGGCTAACCTCAGCCGCTGGGGTGGTTCTAGCCTTT 

GGAGGCGGGGCAAGGAGCGTTAAGGCTGAGGTTTCTTCTACGACTATGAGGTCGAGTCAAAGAGAGTCAAAAATAAAAGAGAT 

GGAAGAAAGTCTTAAAAAATATCCAGAAGTGTCCAATGAGAAATTTTGGGAAAGAAAGTGGTATGGAACCTAI^ 

GATTTTCAAAAGGAGCTAAAAGATTTTACTGAGAAGAGGCTT/V^GGAGATTCTAGATTTAATTGGTAAATCTGGA^ 

AGGGTGGTGAGACTGGTCGTGCTGGCCCAGCCGGACCACAAGGTAAAACTGGTGAGAGGGGCGCCC/V*^GGTCCTAAAGGTGA 

GCGCGGTGAGGAAGGAATGCAAGGTAAAGCTGGTGAAAAAGGTGAGCGCGGTGAAAAAGGCGACAAAGGTGAAACCGGTGAA 

CGCGGTGAAAAAGGCGAAGCTGGAATCCAAGGCCCACAAGGTGAAGCTGGTAAAGATGGCGCTCGAGGTAAAGATGGAGCTC 

GAGGCGAAAAGGGTGAAAAAGGTGAGGGGGGTGAAACGGGAGGTGAGGGTGGAGTAGGCGGAGAAGGTG/WW\GGTGAAAG 

GGGCGCCCAAGGCCCAGCAGGCCCAC/s^GGTGAGGCAGGG/WKGCAGGTGAGC/iuflVGGCCCAGCAGGCCCACAAGGTGAAG 

CAGGCCAACCAGGCGAAAAAGCTCCAG/W^AGAGCCCAGAAGGCGAAGCAGGCC/\ACCAGGCG/WW^GCTCCAG/W\AGAG 

C/W\GAGGTAACTCCAGCTGCAG/WW^CCTGCTGACAAAG/^AGCT/sACCAAACGCCAG/VACGCCGCAATGGC/^^ 

AGACACCTGTAGCCAAC/\ACCACAGACGTCTACCAGC/s^CTGGTGAGC/\AGCCAACCCATTCTTTACAGCAGCAGCAGTAGCA 

GTGATGAC/s^CAGCTGGTGTCCTAGCCGrrAC/WSvACGC/W^G/\AAACAACTAA 

SPy1991 
Seq ID 81 

ATGATACTCTTAATTGAT/VATTACGATTCATTTACCTACAACGTCGCGCAATATTT/\AGTGAAT^ 

AACC/^GACCC/WS(CTTATATGACATGGCC/WWWVGCT/^ACGCTCTAGTCCTCTCACCTGGTCCTGGTTGGCCCAAGG/V\GCC 
AAGCAAATGCCAAAACTCATTG/WKGAGTTTTAGCAAAGAAAACCTATCTTAGGAGTGTGTGTGGGACAC 

GTTTAGGGGGAACCTTAGGGTTGGGGAAAGGGGTCATGCATGGGAGAGAAAGGAGGATTGAAACGG/^GGCCCTGCTAGTGTT 
TTTCGCTCCCTGCGAC/VVGAGATGAGGGTGATGGGCTACCATTCCATCGTTGTGGATCAGTTAGGAAAAGGTTTTAGCGTAAGG 
GCTAGAGACTGTGACGATGAAGAAATGATGGGATTTG/VACACCACACCGTGGGAGTTTTTGGGCTACAATTTGACCGAG/^ 
ATCGGAAGTGCTGATGGGATGACCATGATTGGGAACTTGATGGGAGCCATTCCGCGTTAA 

SPy2000 
Seq ID 82 

GTGTCAA/^ATAGCTAAAATACTTCTGTATTATCACGTTATTTTTGACTGGGCTTATTTTAGTTGCATGTCAACAAC 

AG/W\AGAACGTGAGCGCAAAGAAGGTGGAAAAGACG/\ACTTGTGGTTTCTATGGGGGCAAAGCTGCCTGATGAATTGGATC 

AAGGACCGTTATGGAGTCCAC/VVTGAAGGGAATATCACTCATAGCACTGTATTGAAACGTTCTCGTG/VACTAGATATAAAAGGAG 

AGGTTGCTAAAACATACCATCTCTCTGAAGATGGGCTGACTTGGTCGTTTGACTTGCATGATGATTTTAAATTCTCAAATGGTGA 

GCGTGTTACTGCTGATGATGTTAAGTTTACTTATGATATGTTG/WiiGCAGATGGAAAGGCTTGGGATCTAACCTTCATTAA 

GTTG/V^GTAGTTGGGAAAAATCAGGTCAATATCCATTTGACTGAGGCGCATTCGACATTTACAGCAGAGTTGACTGAAAT^ 

TGGTCCCTAAAAAACATTACAATGATAAGTAT/V^GAGC/KATCCTATCGGTTCAGGACCTTACATGGTAAAAGAATAT^ 

AGAACAAGCTATTTTTGTrCGTAACCCTTATTGGCATGGGAAAAAACCATACTTTAAAAAATGGACTTGGGTCTTACTT 

ACACAGCACTAGCAGCTTTAGAATCTGGTGATGTTGATATGATCTACGCAACGCCAGAACTTGCTGATAAAA/\AGTCAAAGGGA 

CCCGCCTCCTTGATATTCCATCAAATGATGTGCGGGGCTTATGATTACGTTATGTGAAAAAGGGCGTCATCACTGATTCTCCTGA 

TGGTTATCCTGTAGGAAATGATGTGAGTAGTGATGCAGCAATCCGAAAAGGCTTGACTATTGGTTTAAATAGGCAAAAAGTTCTG 

GATACGGTTTT/W\TGGTTATGGTAAAGCAGCTTATTC/SiATTATTGAT/W\AGAGGAT^ 

TAATAAAGTAGCTAAAGCTAAGCAATTATTGAGAAAAGCGGGATGGA/\AGAACAAGCAGAGGGTAGCCGTA/WW^ 

TGATGCAGCGTTTGATGTGTACTAGCCTAGT/Su^TGATCAATTGCGAGCG/^ACTTAGCCGTTG/^AGTAGCAGAGCAAGCC/^AGG 

CCTAGGGATTACTATTAAACTCAAAGCTAGTAACTGGGATG/W^TGGC/VSiCGAAGTCACATGAGTCAGCCTTACT^ 

AGGACGTCATCACGCGCAGCAATTTTATGAATCGCATCATCCAAGCCTAGCAGGGAAAGGTTGGAGCAATATTACGTTTTATAA 

C/^TCCTACCGTGACTAAGTACCTTGAC/W^GC/W\TGACATCTTCTGACCTTGAT/W\GGTAAGGAATATTGGAAGT^^ 

TGGGATGGCAAAACAGGTGCTTGTACTGTTGGAGATTTGCCAAATGTATGGTTGGTGAGCCTTAACGATACTTATATTGGTGAT^ 

AACGTATCAATGTAGGT/Ws^C/kAGGCGTCCACAGTCATGGTCATGATTGGTCATTATTGACTAACATTGCCGAGTGGACTTGGG 

ATGAATC/J^CTAAGTAA 

SPy2006 
Seq ID 83 

GTGAAGAAAACATATGGTTATATCGGCTCAGTTGCTGGTATTTTACTAGCTACTCATATTGGAAGTTACCAACTTGGTAAGC 

ATATGGGTTCAGGAAC/yV\GGACAATCAAATTGGCTATATTGATGATAGCAAAGGTAAGGCAAAAGGCCGTAA 

GATGGATCAAATCAGTGCTGAAGAAGGCATCTCTGCTGAACAGATCGTAGTCAAAATTACTGACCAAGGCTATGTGACCTCACA 

TGGTGACCATTATCATTTTTACAATGGGAAAGTTCCTTATGATGCGATTATTAGTGAAGAGTTGTTGATGACGGATCCTAATTAC^ 

GTTTTAAACAATCAGACGTTATC/V\TGAAATCTTAGACGGTTACGTTATTAAAGTCAATGGCAAGTATTATG 

GGTAGCAAGCGCAAAAACATTCG/W\CC/W\C/\AG/W\TTGGTGAGC/V\GTAGGCAAAGG/\ACTA^ 

TTAGCTC/^AGTGGGCCATCTGAGTAAAGAAGAAGTTGCGGCAGTGAATGAAGGAAAAAGACAAGGACGCTATAGTAGAGAGGAT 
GGGTATAi I i I I AGTGCGACAGATATCATTGATGATTTAGGAGATGGTTATTTAGTACGTCATGGT/iATGACTATGATTATATTGGT 
AAAAAGGATTTGTCTCCAAGTGAGCTAGCTGCTGCACAAGCCTACTGGAGTCAAAAACAAGGTCGAGGTGCTAGACCGTCTGAT
TACCGCCCGACACCAGCCCCAGCCCGAGGTCGTAGG/W\GCCCCAATTCCTGATGTGACGGCTAACCCTGGACAAGGTGATGA 
GCCAGATAACGGTGGCTATCATCCAGCGCCTCCTAGGCCAAATGATGCGTCACAAAACAAACACCAAAGAGATGAGTTTAAAG
G/WW\CCTTTAAGGAACTTTTAGATCAACTACACCGTCTTGATTTG/W\TACCGT^^ 

ACCGACTC/s^GTGATC/WKTCAAACGCTTTTGGGTATGTGGTGCCTCATGGAGATCATTATCATATTATCCCAAG^ 

TCACCTCTTGAAATGGAATTAGCAGATCGATACTTAGCCGGCCAAACTGAGGACGATGACTCAGGTTCAGATCACTCAAAACCA 

TCAGATAAAGAAGTGACACATACGTTTCTTGGTCATCGCATCAAAGGTTAGGGAAAAGGCTTAGATGGTAAACCATATGATACGA 

GTGATGCTTATGTTTTTAGTAAAGAATCCATTCATTCAGTGGATAAATCAGGAGTTACAGCTAAACACGGAGATCATT^ 

ATAGGATTTGGAGAACTTGAACAATATGAGTTGGATGAGGTCGCTAACTGGGTGAAAGCAA/i^GGTCAAGCTGATGAGCTTGCT 

GCTGCTTTGGATCAGGAACAAGGCAAAGAAAAACCACTGTTTGACACTAA/\AAAGTGAGTCGCAAAGT 

GTGGGGTATATGATGCGAAA^.GATGGCAAGGACTATTTCTATGCTCGTGATC/y\CTTGATTTGACTCAGATTGGCTTTGC 

AAGAAGTAATGCTTAAAGATAAGAAACATTACCGTTATGACATTGTTGACACAGGTATTGAGCCAGGAGTTGCTGTAGATGTGTC 

AAGTCTGCCGATGCATGGTGGTAATGCTAGTTACGATAGTGGAAGTTCGTTTGTTATCCCTCATATTGATCATATCCATGTCGTT 

CCGTATTCATGGTTGACGCGCGATCAGATTGCAACAATCAAGTATGTGATGCAACACCCCGAAGTTCGTCCGGATATATGGTGT 

AAGCCAGGGCATGAAGAGTCAGGTTCGGTCATTGCAAATGTTACGCCTCTTGATAAACGTGCTGGTATGCCAAACTGGCAAATT 

ATCCATTGTGCTGAAGAAGTTCAAAAAGCCCTAGCAGAAGGTCGTTTTGC/\ACACCAGACGGCTATATT^ 

GTTTTGGCCAAAGAAACTTTTGTATGGAAAGATGGCTCCTTTAGCATCCCAAGAGCAGATGGCAGTTCATTGAGAACCATTAAT^ 

AATCTGATCTATCCCAAGCTGAGTGGCAACAAGCTCAAGAGTTATTGGCAAAGAAAAACGCTGGTGATGCTACTGATACGGATA 

AACCCAAAGAAAAGCAACAGGCAGAT/U\GAGGAATGAAAACCAACAGCGAAGTGAAGCCAGTAAAGAAGAAGAA 

GATGACTTTATAGACAGTTTACCAGACTATGGTCTAGATAGAGGAACCCTAGAAGATCATATCAATCAATTAGCACAAAAAGCTA 

ATATCGATGCTAAGTATGTCATTTTGGAACGAGAAGGTGTCCAATTTTATAAT/WWKTGGTGAATTGGTAACTTATGATATC/^ 

AGACTTCAAC/W^TAAACCCTTAA 

SPy2009 
Seq ID 84 

ATGCGTAGAGCAGAAAATAACAAACACAGCCGCTATTCCATTCGCAAACTGAGCGTTGGGGTAACGAGTATAGCAATTGCGAGT
CTCTTTTTAGGAAAGGTTGCCTATGCGGTAGATGGCATCGCTCCAATCTGTCTTACTCAAAAGAGTAGAGCCACTACATCAGAAA 
ATTGGCATCATATTGATAAGGATGGCCTTATTCCTTTAGGTATAAGCTTAG/Sw^GCTGCCAAAGAGGAATTTA^ 
AG/\ATCACGTTTATCTG/iiAGCACAA/W\GAAACGTATAAAC/WW^ATTAAAACTGCACCAGACAA^ 

ATCATAGTGAGTATATGACAGCCGTTAAGGATCTTCCAGCGTCTACTGAGTCTACTACTCAGCCAGTTGAGGCACCCGTGCAGG 

AGACACAGGCATCAGCTTCAGATTCGATGGTGACAGGTGATTC/VACATCAGTTACGACTGATTCTCCTGAGGAAACGCCATCTT 

CGGAAAGTCCAGTGGCCCCAGCTTTATCTGAGGCTCCAGCTG/s^CCAGCTGAGAGTGAGGAACCTTCAGTAGCAGCATCTTCT 

GAGGAAACCCCATCTCCATC/kACTCCAGCGGCCCCAG/WVCTCCTGAAGAACCAGCAGCTCCATGTCCATCACCTGAGAGTGA 

GGAACCTTCAGTAGCAGCTCCTTCTGAGGAAACCCCATCTCCAG/kAACTCCTGAAGAACCAGCAGCTCCATCTCAACCAGCTGA 

GAGTGAAGAATCTTCAGTAGCAGCTACGAC/s^GCCCGTCTCCATC/^CTCCAGCTG/SiATCAGAGACTCAGACGGCACCAGCTG 

TTACTAAAGACTCTGATAAGCCATCTTCAGCAGGTGA/W\ACCAGCAGCCTCTTCACTTGTTTCAG/VAC/Ws^CCGT^ 

/V\CTTCAAAGAGATGTTCTGAT/WW\AGAAGAGG/VKGAACAGTCTTACTCTCCA/KATCGCTCATTGTCAAGACAGG^^ 

GATGAGTCAGGTAAGTAGTTGCGTTCAACAGGTGAAAAAGCACAGCCACTCTTTATAGCTACTATGACTTTGATGTCTCTATTTG 

GCAGTCTTTTAGTCAC/W\ACGCCAAA/\AGA/\ACTAA/W\ATAG 

SPy2010 
Seq ID 85 

TTGGGTAAAAAACAAAAATTACCATTTGATAAACTTGCCATTGCGCTCATGTCTACGAGGATCTTGCTCAATGGACAATCAGAC^^ 

TAAAGCAAATACTGTGACAGAAGACAGTGCTGCTAGCG/VACAAGCTGTAGAAACGCCACAACCAAGAGCGGTTTCTGAGGAAGG 

AGCATCATGAAAGGAAACCAAAACCCCAGAAAGTGCTGATGACGCAGAAGAAACAATAGCAGATGACGCT/VATGATCTAGCCCC 

TGAAGCTGCTGCTAAAACTGCTGATACACCAGCAAGCTCAAAAGCGACTATTAGGGATTTG/y\CGAGCCTTGTCAGGTCAAAA^ 

GCTGCAGGAAAAAGGAGGC/WkGGAGGTGGGACTGTTGTTGCAGTGATTGATGCTGGTTTTGATAAAAATGATG^ 

GCTTAACAGACAAAACGAAAGCACGTTAGGAATGAAAAGAAGATGTTG/WW\GCTAAAA/KAGAG 

AGTGGGTCAATGATAAGGTTGCTTATTACCACGACTATAGTAAAGATGGTAAAACCGCTGTCGATCAAGAGCACGGCACACACG 

TGTCAGGGATCTTGTCAGG/W\TGCTGCATCTGAAACGAAAG/VACCTTACCGCCTAGAAGGTGCGATGCCTGAGGCTCAATTG 

GTTTTGATGCGTGTCGAAATTGTAAATGGAGTAGGAGACTATGCTCGTAACTACGCTCAAGGTATCATAGATGCTGTCAACTTGG 

GAGGTAAGGTGATTAATATGAGCTTTGGTAATGCTGCAGTAGCCTATGCCAACCTTCCAGACGAAACCAAAAAAGCCTTTGACTA 

TGCCAAATC/W\AGGTGTTAGCATTGTGACCTCAGCTGGTAATGATAGTAGCTTTGGGGGCAAGACCCGTCTACCTCTAGCAGA 

TGATCCTGATTATGGGGTGGTTGGGACACGTGCAGCGGGAGACTCAACATTGACAGTTGCTTCTTACAGCCCAGATAAACAGCT 

CACTG/WVCTGCTACGGTC/WVACAGGCGATCAGCAAGATAAAGAAATGCCTGTTCTTTC/\AG/W\GCGTTTTGAGGCA^ 

GGCTTACGACTATGCTTATGCTAATCGTGGGATGAAAGAGGATGATTTTAAGGATGTC/W\GGT/V\GATTGGGGTTATTGAA 

GGCGATATTGATTTC/WS^GAT/kAGATTGCA/V^CGCTA/WW^GCTGGTGCTGTAGGAGTCTTGATCTATGACA^ 

GGCTTCCCGATTG/^TTGCC/WS^TGTTGATCAGATGCCTGCGGCCTTTATCAGTCG/WV^GATGGTCTCTTATTAAA^ 

CGCAAAAAAGCATCACCTTC/VATGCGAGACCTAAGGTATTGCCAACAGC/V^GTGGCACCAAACT/V\GCCGCTTCTC/V\GCTGGG 

GTCTGACAGCTGACGGC/KATATT/V^GCCAGATATTGCAGGACCCGGCC/VVGATATTTTGTCATCAGTGGCT/V\C^^ 

CC/W\CTTTCTGGAACTAGTATGTCTGCGGCATTAGTAGCGGGTATCATGGGACTGTTGC/W^GCAATATGAGACACAGTATC 

CTGATATGACACCATCAGAGCGTGTTGATTTAGCTAAAAAAGTATTGATGAGCTCAGC/\ACTGCCTTATATGATGAAGATGAA^

AGCTTATTTTTCTCCTCGCCAAC/^AGGAGCAGGAGCAGTCGATGCT/WW^AGCTTCAGCAGCAAGGATGTATGTGACAGAT 

GGATAATACCTCAAGCAAGGTTGAGCTGAACAATGTTTCTGATAAATTTGAAGTAACAGT/Su^CAGTTCAC/^ 

CTCAAGAGTTGTATTACCAAGCAACTGTTCAAACAGATAAAGTAGATGGAAAACTCTTTGCCTTGGCTGCTA/^ 

GACATCATGGCAAAAAATCACAATTCCAGCC/W\TAGCAGCAAACAAGTCACCATTCCAATCGATGTTAGTCAATTTAGC/^ 

TTGCTTGCCCCAATGAAAAATGGCTATTTCTTAGAAGGTTTTGTTCGTTTCAAACAAGATGC 

TCCCTATATTGGTTTCCGAGGTGATTTTGGCAATCTGTCAGGCTTAGAAAAACCAATCTATGATAGC/^GACGGTAGCAGCT^ 

TATCATGAAGCAAATAGTGATGCCAAAGACCAATTAGATGGTGATGGATTACAGTTTTACGCTCTGAAAAATAACTTTAGAGCAG 

TTACTACAGAGTCTAATCCATGGACGATTATTAAAGCTGTCAAAGAAGGGGTTGAAAACATAGAGGATATCGAATCTTCAGAGAT 

CACAGAAACCATTTTTGCAGGTACTTTTGCAAAACAAGACGATGATAGCCACTACTATATCCACCGTCACGCTAATGGCAAGCCA 

TATGCTGCGATCTCTCCAAATGGGGACGGTAACAGAGATTATGTCCAATTCCAAGGTACTTTCTTGCGTAATGGTAAAAAGCTTG 

TGGCTGAAGTCTTGGACAAAGAAGGAAATGTTGTTTGGAGAAGTGAGGTAACCGAGCAAGTTGTT/WWKCTACA^ 

TGGCAAGGACACTTGGTTCAACCCGTTTTG/WW\AGGCGTTGGGACGGTAAAGATAAAGAGGGGAAAGTTG 

GATACACCTATGGTGTTCGCTAGAGTCGGATTAGGTGAGGTGCAA/\AG/V\CAAGACACTGATTTTGATGTGATTGTAGAG/V\TAC 

GACACCTGAAGTCGC/\ACATCGGC/^ACATTGTC/VACAGAAGATGGTCGTTTGACACTTGCATCTAAACC/WW^CCAGCC/^ 
GGTTTACCGTGAGCGTATTGCTTACACTTATATGGATGAGGATCTGCCAACAACAGAGTATATTTCTCCAAATGAAGATGGTACC

TTTACTCTTCGTG/^AGAGGCTGAAAC/VATGG/^GGCGCTACTGTTCCATTGAAAATGTCAGAGTTTACTTATGT^ 

TGGCTGGT/^ACATCACTTATACACCAGTGAGTAAGCTATTGGAAGGCGACTCTAATAAACCAGAACAAGACGGTTCAGATCAAG 

CACCAGACA/WWKCCAG/W^CT/W^CCAGAACAAGACGGTTCAGGTCAAGCACCAGATAAAAAAGCAGAAACT/W\C^^ 

AAGACGGTTCAGGTC/W^CACCAGACA/WW^CCAGAAACT/W^CCAG/Su^CAAGACGGTTCAGGTC/W^CACCAGAT 

CAGA/VkCTA/iu^CCAGA/WV\GATAGTTCAGGTC/W^CACCAGGTA/W\CTCOTCA/W\AGGTCAACCT^ 

ACGATCTrCTAAGCGTGCTTTAGCTACAAAAGCATCAAC/Ws^GATCAGTTACCAACGACTAATGACAAGGATACA/i^ 

CATCTCCTTAAGTTAGTTATGACCACTTTCTTCTTGGGATTAGTAGCTCATATCTTTAA/V^CA^^ 

SPy2016 
Seq ID 86 

ATGAATATTAGAAATAAGATTGAAAATAGTAAAACACTACTATTTACATCCGTTGTAGCCGTGGCTGTACTAGGAG 

AGTTTCAGCCGAAACGTATACATCACGGAATTTTGAGTGGTCTGGAGATGAGTGGTGTGGAGATGACTGGCCTGAAGATGACTG 

GTCTGGAGATGGTTTGTCTAAATATGAGCGGTCTGGAGTTGGTTTGTCTCAATATGGCTGGTCTAAATATGGCTGGTCTAGCGA 

TAAAGAAGAATGGCCTGAAGATTGGCCTGAAGATGACTGGTCTAGCGATAAAAAAGATGAGACAGAAGATAAAACGAGACCACC 

ATATGGAGAAGCATTAGGTACAGGGTATGAAAAACGTGATGATTGGGGAGGACCTGGTACGGTGGC/KACTGACCCTTACACTC 

CACCATATGGAGGAGGATTAGGTACAGGGTATGAAAAACGTGATGATTGGGGAGGACCTGGTAGGGTGGG/\AGTGACCCTTAC 

ACTCGACCATATGGAGGAGCATTAGGTACAGGGTATGAAAAACGTGATGATTGGAGAGGACCTGGACATATTCCTAAACCTGAG 

AACGAACAATCAGCAAACCCACTTCATATTCCTGAACCTCCTGAGATTGAGTGGCCTCAGTGGAATGGGTTTGATGGATTATGAT 

TTGGCGCCTCTGATTGGGGCCAATCTGAGGACACCCCTCCAAGTGAACCTCGTGTGCCAGAAAAACCGC/KACATAGTCCTCAA 

AAAAATGCACAAGAATCAGATTTTGATAGAGGGTTTTCAGGTGGGTTGAAAGCAAAAAAGTGAGGTAGAGGTATTGATT^ 

GTTTCCAGTATGGTGGCTGGTCAGACGAATATAAAAAAGGTTACATGCAAGCCTTCGGTACACCATATACACCATCAGC/VSkCGT 

AA 

SPy2018 
Seq ID 87 

ATGGCTAAAAATAACACGAATAGACACTATTCGGTTAGAAAATTAAAAACAGGAAGGGGTTCAGTAGCGGTAGCTTTGACTGTTT 

TAGGGGCAGGTTTTGCGAATCAAACAGAGGTTAAGGCTAACGGTGATGGTAATCCTAGGG/iAGTTATAGAAGATCT^ 

AC/>iATCCCGCAATACAAAATATACGTTTACGTTACGAAAACAAGGACTTAAAAGCGAGATTAGAGAATGC/^^ 

GAAGAGATTTTAAGAGAGCTGAAG/\ACTTG/WWVGC/W\ACAAGCCTTAG/KAGACCAGCGTAAAGA^ 

AG/\ACTAC/KAC/\AGACTATGACTTAGC/WVGGAATC/\ACAAGTTGGGATAGACAAAGACTTGAAAAAGAGT^ 

GGAAGCTCTTG/s^TTAGCGATAGACCAGGCAAGTCGGGACTACCATAGAGCTACCGCTTTAGAAAAAGAGTTAGAAGAGAAA^ 

G/WVGCTCTTG/s^TTAGCGATAGACC/VAGCGAGTCAGGACTATAATAGAGCTAACGTCTTAG/WWVGAGTTAG/^ 

AGAG/VSiCAAGAGATTAATCGTAATCTTTTAGGGAATGCAAAACTTGAACTTGATC/\ACTT^ 

TCG/WW^GC/WV^CTTGAGG/i^GAAAAACAAATCTCAGACGC/iu^GTCGTCAAAGCCTTCGTCGTGAC^^ 

/\AGCTAAG/W\CAGGTTG/WWVGATTTAGC/W^CTTGACTGCTGAACTTGATAAGGTT/W^GAA^^ 

AAGCCGTCAAGGCCTTCGCCGTGACTTGGACGCATCACGTG/\AGCT/^GA/\ACAGGTTGAAAAAGATTTAGC/WSiC^ 

TG/W\CTTGATAAGGTT/W\GAAGAAAAACAAATCTCAGACGC/KAGCCGTCAAGGCCTTCGCCGTGACTTGGACGCATCACGTGA 

AGCTAAGAAACAAGTTG/WW\GCTTTAGAAGAAGCAAACAGCAAATTAGCTGCTCTTG/WW^CTT^ 

AGCAAGAAATTAACAGAAAAAGAAAAAGCTG/V\GTAC/V\GCAAAACTTGAAGCAGAAGCAAAAGCACTC^^ 

A/\ACAAGCTGAAGAACTTGCAAAACTAAGAGCTGG/W\AGCATCAGACTCAGAAACCCCTGATAC/WVKCCAGGA^ 

GTTCCAGGTAAAGGTC/VKGCACCAC/VAGCAGGTAGAAAACCTAACCAAAACAAAGCACCAATGAAGGAAACTAAGAGACAGTT^ 

CCATCAACAGGTGAAACAGCTAACCCATTCTTCACAGCGGCAGCCCTTACTGTTATGGCAACAGCTGGAGTAGCAGCAGTTGTA 

AAACGCAAAGAAGAAAACTAA 

SPy2025 
Seq ID 88 

ATGAAGAAAAGGAAATTGTTAGCAGTAACACTATTAAGTACGATACTCTTAAACAGTGCAGTGCCATTAGTTGTTGCTGA^^ 

CCTTGCGTAATAGCACATGATCCACTGATCAGCGTACTACAGCAGATACTGATACGGATGACGAGAGTGAAACACCAAAAAAAG 

ACAAAAAAAGCAAGGAAACAGCGTCGCAGCACGACACCCAAAAAGACCATAAGCCATCAGACACTCACCCAACCCGCCCTTCA 

AATGATACTAAGCAGACCGATCAGGCATCATCTGAAGCTACTGAGAAACCAAAT/W\GACAAAAACGACACCAAGCAAGCAGAC 

AGCAGTGATCAATCCACGCCATCTCCCAAAGACCAGTCGTGTCAAAAAGAGTCACAAAACAAAGACGGCCGACCTAGCCCATCA 

CGTGATGAGCAAAAAGATGAGACACCTGATAAAACACCAGAAAAATCAGCTGATAAAACCCCTGAAAAAGGACCAGAAAAAGGA 

AGTGATAAAACAGGAGAGCC/W^TCGTGACGCTGCAAAACCCATCCAACCTCCTTTAGCAGCTGCTCCTGTCTTTATACCTTGGA 

GAGAAAGTGACAAAGACCTGAGCAAGCTA/WVGGAAGGAGTCGCTCATCAGCGGCTTAGGTGAGAGAGTGGACAGGTGACTCT 

GCCTACACTCACAACCTGTTGTCACGCCGTTATGGGATTAGTGGTGAACAGCTAGATGGTTTTTTGAACAGTGTAGGTATTCACT 

ATGAT/W^G/\ACGCTT/VVACGGA/^GCGTTTATTAG/VJ^TGGGAAAAACT/\ACAGGACTAGACGTTCGAGCT^^ 

CAATGGGAGAAAGGTCAGTAGGTACTCAGGGAGTTGCTAAAGAAAAAGGAGCC/VS^TATGTTTGGTTATGGCGCCT^ 

ACCCAAAC/V\TGCCA/WW\TACAGCGATGAGGTTGCTATTCGTCACATGGTAGAAGACACCATCATTGCCAAC/WW^CC;^ 

CCTTTGAAAGAC/VS^GACCTCAAAGGAAAAAAATGGTCACTAGGCCAGTTGGATACCTTGATTGATGGTGGGGT^ 

ATAC/\AGTGGGAGTGGGCA/^AGACGAGCAGATATCATGACCAAAGTAGACC/SiATGGATAGATGATCATGG/\AGCACACCTGAG 

ATTCCAG/\ACATCTCAAGATAACTTCCGGGACAC/\ATTTAGCG/^GTGCGCGTAGGTTATA/W^GAAGTCAGCCACAA/^ 

TGACCTACAAGTCAGAGACCTACAGCTTTGGCCAATGCACTTGGTACGCCTATAATGGTGTCAAAGAGCTAGGTTATCAAGTCG 

ACAGGTACATGGGTAACGGTGGCGACTGGCAGCGCAAGCCAGGTTTTGTGACCACCCATAAACCT/W^GTGGGCTATGTCGTC 

TCATTTGGAGCAGGCCAAGCAGGAGCAGATGC/^ACCTATGGTCAGGTTGCTGTTGTAGAGCAAATCAAAGAAGATGGTTCTATC 

TTAATTTCAGAGTCAAATGTTATGGGACTAGGCACCATTTCCTATCGGACGTTCACAGCTGAGCAGGCTAGTTTGT^^ 

TCGTAGGGGACAAACTGCCAAGACCAT/iwS^ 

SPy2039 
Seq ID 89 

ATG/V\T/yyW^GAAATTAGGTGTCAGATTATTAAGTCTTTTAGCATTAGGTGGATTTGTTCT 

AAACTTTGCTCGTAACGAAAAAGAAGCAAAAGATAGCGCTATCAGATTTATCCAAAAATCAGCAGCTATCAAAGCAGGTGCACGA 

AGGGCAGAAGATATTAAGCTTGACAAAGTTAACTTAGGTGGAGAACTTTCTGGCTCTAATATGTATGTrrAG/^^ 

AGGATTTGTTATCGTTTGAGGAGATAAACGTTCTGGAGAAATTCTAGGATACTCTACGAGCGGATCATTTGACGGTAAGGGTAAA 

GA/W\GATTGGTTGGTTGATGG/W\GTTATGTGGAAGAAATG/W\G/W\AGAAAAAATTAGACAGTAGT^^ 

AGATT/WS^GAAGGAGTTGTT/W^TCTGTCCTTGATTC/WV^GGCATTCATTACAATCAAGGTAAGCGT^^ 
GTTATTGAA/W\GTAAAACCAGGTGAACAATCTTTTGTAGGTCAACATGCAGCTACAGGATGTGTTGCTACTGCAACTGCTC^^

TTATGAAATATCATAATTACCCTAACAAAGGGTTGAAAGAGTACACTTACACACT/KAGCTCAAATAACCCATATTTCA^ 

AAGAACTTGTTTGCAGCTATCTCTAGTAGACAATACAACTGGAACAACATCTTACCTACTTATAGCGGAAGAGAATCTAAC 

AAAAAATGGCGATTTCAGAATTGATGGCTGATGTTGGTATTTCAGTAGACATGGATTATGGTCCATCTAGTGGTTCTGCA 

GTCTCGTGTTG/KAAGAGCCTrG/W\G/W\ACTTTGGCTAC/V^CCAATCTGTTCACCAA^ 

GATTGGGAAGCACA/\ATTGAC/WKGAATTATCTGAAAACCAACCAGTATAGTACCAAGGTGTCGGTAAAGTAGGCGGA 
TTTGTTATCGATGGTGCTGAGGGACGTAACTTCTACCATGTT/^AGTGGGGTTGGGGTGGAGTCTCTGACGGCTTCTTCGGTCT^ 
GACGCACTAAACCCTTCAGCTCTTGGTACTGGTGGCGGCGCAGGCGGCTTCAACGGTTACC/\AAGTGCTGTTGTAGGCATCAA 
ACCTTAG 

SPy2043 
Seq ID 90 

ATGAATCTACTTGGATGAA.GACGGGTTTTTTCTAAAA.^.TGTGGGCTAGTAAAATTTTCAATGGTAGCTCTTGTATCA 

GGCTGTAACAACAGTCACACTTGAAAATACTGCAGTGGCACGACAAACACAGGTCTCAAATGATGTTGTTCTAAATGATGGCGC 

AAGCAAGTACCT/W^CGAAGCATTAGCTTGGACATTCAATGACAGTCCCAACTATTACAAAACCTTAGGTACTAGTGAGATCACT 

CCAGCACTCTTTCCTAAAGCAGGAGATATTCTCTATAGCAA^TTAGATGAGTTAGGAAGGACGCGTACTGCTAGAGGTACATTG 

ACTTATGCCAATGTTGAAGGTAGCTACGGTGTTAGACAATCTTTCGGTAAAAATCAAAACCCCGCAGGCTGGACTGGAAACCC 

AATCATGTCAAATATAAAATTGAATGGTTAAATGGTCTATGTTATGTCGGAGATTTCTGGAATAGAAGTCATGTCAT^ 

TCTCGGTGGAGATGCACTCAGAGTCAATGCCGTTACAGGGACACGTACCCAAAATGTAGGAGGTCGTGACCAAAAAGGCGGCA 

TGCGCTATACCGAACAAAGAGCTCAAGAATGGTTAGAAGCAAATCGTGATGGCTATCTTTATTATGAAGCTGCTCCAATCTATAA 

CGCAGACGAGTTGATTCGAAGAGCTGTCGTGGTATCAATGGAATCTTCTGATAATACCATCAACGAGAAAGTATTAGTTTAGAAC 

ACAGCTAATGGCTAGACCATTAACTACCAT/\ACGGTAGACCTACTCAGAAATAA 

SPy2059 
Seq ID 91 

ATGAGATTTCTAGAACTTTTACAAAAGAAATTTTTTCCTAAAGCATATCAGGAAAAACAATTC 

ACGCCACAACACAATCAAAAGCAGTATTCGCCAAATGCC/UTCATTTGGACTCATCAGCTACCAAAAACTCAGAACAAGACCCT 

GGAACAGCTCTGCAACGCAGTAGAGCCTACGAAGGAAGCCCTAAAAGTCGGCCCGCTTGGTTGCAAAAGGTGGAAGCTGTTTT 

GCCGTCTCCTCAACGTCC/KATTCGGCGTTTTTGGGGCCGCTATCACATCGGAAAACTGCTAATGATTCTGATTGGAACTCTTGT 

CTTACTCTTAGGATCATACTTGTTTTACTTATC/WWiLCAGCT/\AAGTATCTGATTTACAAGATGCCT^ 

TTTATGATCAC/\AAGGAGAGTATGCAGGCAGTTTATCTGGTCAAAAAGGGAGTTATGTTGAGCTCAAGGCTATTT^^ 

TGAG/\ATGCTGTTATTGCCACTGAGGATAGGACI I I I I ACAGT/VATAGCGGTATT/\ATCTT/WiiCGCTTCTTATTGGCGGTAGTT 

ACGGCGGGCCGCTTTGGAGGTGGCTCAACGATTACACAGGAACTGGCT/WWKTGCTTATCTCTCACAAGATCAGAGAATTAAA 

CGAAAGGCCCGAGAGI II II 11 I GGCGTTAGAGTTGACCAAAAAATACAGTAAAAAAGATATTCTTACTATGTACCTT/SiACAACTC 

CTACTTTGGAAATGGAGTTTGGGGAGTTG/^AGATGCCAGTCAAAAATATTTTGGAACCACAGCTGCT^ 

GCTGCCACATTAGCAGGTATGCTC/Wi^GGACCTGAAATATATAACCCTTACCATTCTCTAAAAAATGCTACTCACCGTAGAGATA 
CTGTTTTAGGAGCGATGGTTGATGCC/WW^GATTACCCAAACAAAAGCTCAGCAAGCTAGAGCAGTAGGGCT/WS^^ 
TAGCTGATACTTATGTTGGT/V\GACAGATGACTACAAATACCCATCCTACTTTGATGCTGTTATTAGTGAAGCAATAGCAACT^^ 
GGTCTTTCAGAAAAAGACATTGTTAATAATGGATACAAAGTTTACACTGAGCTAGATC/^AAATTA 

CTTTTAAC/iACGATGAACTATTTCCTGTTTCAGCTTATGACGGTAGCTCTGCTCAAGCAGCTAGTGTTGCTTTAGA^ 

GGAGGTGTTAGAGGTCTGATTGGTCGTGTGAATAGTAGTGAAAATCCGACTTTCAGAAGTTTTAACTATGCGACTC/^ 

CGTAGTCCCGCATCAACAATCAAACCACTCGTGGTTTACGCGCCAGCCGTTGCTTCAGGATGGTCAATTGAAAAAGAACTACCA 

AATACCGTTC/V\GATTTCGATGGCTATCAGCCACATAATTATGGAAATTATGAATCAGAAGATGTTCCTATGTATC/VAGCATTAG 

AAACTCTTATAATATTCCAGCAGTTTCTACATTG/^CGATATGGGAATCGAT/WKGCCTTTACCTATGGT/WVVCA^ 

ATATGAGCTGTGCC/VAAAAAGAGTTGGGGGTAGGTTTAGGTGGGAGCGTGACAACCAATCCATTGGAGATGGCTCAGGCATAT 

GCTGCCTTTGCCAATAATGGAGTAATCCATCCTGCGCACTTGATTAACCGGATTGAAAATGCCAGGGGTGAAGTGCTTAAAACC 

TTTACTGATAAGGCTAAACGTGTTGTCAGCCAGTGTGTTGCAGATAAGATGACAGCCATGATGCTAGGTACCTTTTCAAATGGAA 

CAGCAGTCAATGCTAACGTATATGGCTATACACTAGCTGGT/WVACAGGGACGACAGAAACCAACTTCAATCCCGACTTAGCAG 

GCGATCAGTGGGTTATTGGTTATACGCCAGATGTTGTTATTAGTCAATGGGTAGGATTTAATCAGAGCGATGAAAATCATTATCT 

AACGGATTCAAGTGCAGGCACGGCCTCAGCTATTTTTAGCACTCAGGCATCTTACATTTTGCCTTATACCAAGGGCAGCCAAT^ 

CATGTAGATAATGGCTACGCTCAAAATGGTATTTCAGCTGTTTATGGAGTCAATGAAACAGGTAATCAATCAGGAGTTGATACTC 

/V\TCTATTATTGATGGTTTAAG/W^ATCAGCAC/\AG/^AGCTTCGCAATCACTATCAAAAGGAGTCGATCAGTC 

T/WKGCCCAATCTATTTGG/W!^GAGATTGTTGACTATTTTAGATAG 

SPy2110 
Seq ID 92 

ATGGTAAGTTTAGAAG/!^GAC/Su\GGTGAGTGTTGAACCTGATATTAAAGTGATTAAAGGAGATGGTCGCCTTGTTj^ 

gtac/wwktctatagtggtttattaaaagcaagcatg/wvgt/v\ctcggatgtcgccacttgttgaggctaa^ 

ttctgatggcattatagcag/ws^ttattgagggttttcc/v\ct/\atatc/wvktttatg/^ 

tgttgcagctaatgaatatgctattgcaaaag/vvtacattaattatcgtactcagcgtgactttgcaggt^ 

rcaa\ i 1 i ictattgat/wiittaatt/^t/w\gatcaaacagttgttaatg/w\atgctaacaaagata 

aacgagattt/i^ctgctggaatcgtaggga/siatcgattggttt/ww^tgttaccttcgcatgt^ 

agatatggattaccatgatttggattacagtccttatacagcgatgaggaactgctgtttaattgagtttaagggcatgt^ 

/\atggctttaaaattggtaatgctgaagtggaaagtccc/v\gtctattcaaactgcaacagctcagatctc^ 

atgtagcatcaagtcagtacggcggatgcacagctgatcgcattgacgagttrrtagccccatatgcggagctt/kacttt/^^ 

aacatatggctgatgctaagaaatggatcgttgagact/^agagagaaagctatgcttttgaaaagactc 

tgcgatggagtctttggagtatgaaattaatacgctctttacgtgtaatggtcaaacaccatttacttct^ 

ggacgtcttggtttg/vkggtgagattcaaaaagctattttgaccattgggattaatggtcttggtagtgaacatcgcacggctat 

tttccctaaattaattttcacggttaaacgtggcttgaatttagaaccagattcacgaaagtatga 

aatgtgcgactaagcggatgtacccggatatgttatcttatgataaa^ttattgatttgacaggatctttcaaatctcc/\^ 

atgccgctctttccttcaaggctggaaagatgaaaatgggcaagatgtgacctcaggccgtatg^atgttggggttgtcaccct 

caatttagctcgcattgccatggaatcaaatggcgatatggataagttttgggagctgtttaatgagaggatgctaattagtaag 

gatgcttt/v\tttatcgtgtcgaacgtgtcacagaagc/w\accagcaaatgctcctattctttatcaatatggtgcttt^ 

gcgtttggagaagacaggg/v\tgt/wvtgatctgtttaagaatcgtcgtgcaacagtctctcttggctatattggtcttta 

gtggcgtgtgttttttatggtggtc/\atggg/\aggt/\atccagatgct/w^gcttttacgttgtc/\attgtgaagggaatg/^ 

aggggtgtgaggattggtgagatgaatatggttatcatttctctgtttatrcgactccatcagaaagtttgacagatcgctt^ 
TCGTTTAGATACGGAj^AAATTTGGCAT^^
AGAGTCCGACACCTTTTGA/W\ATTAGATT^ 

GTATCCTGTTTTGCAACAAAATCCAAAGGCCTTGGAAGCGGTTTGGGACTATGCTTATGATCGTGTGGGGTATTTGGGA^ 
TACGCCTATTGATAAATGCTATAATTGCCAATTTGAAGGGGATTTTACCCC/KACAGAACGTGGTTTTACT^ 
AAT/s^TGAGCCTAAAACAGTTGATGTGGTCAAACGTACATGTGGTTACTTGGGGAATCGTCAGGCCCGCCC/Sw^^ 
CGCCAT/^AGG/WS^TCTCTGCGCGTGTAAAACATATGAATGGTTCTACTATAAAATACCCAGGCCTGTAA 

SPy2127 
Seq ID 93 

ATGAGGAGGAATTACAGTAGAGTCATTGACGAACTGCGTAGTGACTACGGGCTGAATTTAGTTGCTATTGGTCAACGT^ 

ACCGACCCCCGA/^.CAGTTGGTAAATGGTGGCAGGGTAAACATAAGGCGAACCAAGAAAGCAGA/kAGAAACTG/^ 

TAGAGAGGTGAAAGAAACTATGATGAGACAAGTAAATATTTTTGAAGAAGCTAATGACAACACAAAGCAGGTTATGC/^ 

ACAACGACAAATTTTGATGGACAACCTTTAGACATTTACGGTGATATTCAGGAGCCTTTATTTTTGGCTAGGGCAGTCGCTGAAA 

TGATTGATTACACAAAAACTAGCCAAGGGTACTATGACGTACAAGCTATGCTAAGAAA^GTAGATGAGGATGAAAAGCTTA^ 

GAATGGCTTTAGAAGGTACTACGAAAAATTTTCGTAGTGGTCAAAAAGTTTGGTTTTTAA 

TATGCGTTC/W\CAAACCAAAAGCCAAAGAGTTTAGAAAAGCAGTCAAAAACATTGTAAAA 

ATGCAAGGCGAATTGGTGCAAGAACTAGCTCAACCAAGCACCCAAAAACTACCAGGTATAAGTGACCT/iufi^CTTATA^ 

AGCTAGCTGATTTAGTTGATATGGATAATCTAGCTGATATTTCAAATGGGATTGACCGAGTTCAGCAAGTAGTGAAGCTG^^ 

CTTGTAG 

SPy2191 
Seq ID 94 

ATGTTTAAG/WKGAAAATTTAAAACAAGGTTATTTTAATTTTGGATTAGT^ 

TTCTCAAGTAAAAATGCTGATACTAAGTGTTATGCTAAGAAGTCAGAAAGTAAAATGGTAACAATCGACAAGGGTCCA 

ATCATGCTATTACTAAAGAAGAAAGCAAAGAAAAAGCAAAGAGCATTGCTTCGGAGCCTATTCCCACAGTAGAAAACTCTGTAGC 

TCCGACAGTAACAGAGGAAGTACCGGTTGTTCAGCAAGAAGTGACTCAAACTGTTCAGCAGGTATCTTCAGTAGCCTATAATCC 

AAACAATGTGGTACTTTCCAATGGAAATACTGCTGGTATTGTAGGAAGTC/WGCGGCGGCACAGATGGGAGCAGCAAGAGGTG 

TTCCAC/\ATCAACTTGGG/\ACATAT/\ATTGGGCGTGAATCT/\ATGGAAATCCTAACGCAGCTAATGGTTCTGGGGCATCAGGGT 

TGTTCCAGACAATGCCAGGTTGGGGTTCTACAGCAACGGTTGAAGATCAAGTCAATGCAGCCTTGAAAGCGTATAGTGCAGAAG 

GTTTATCAGCTTGGGGTTACT/\A 

SPy2211 
Seq ID 95 

ATG/WW^TAATAATA/Su\TGGATAATTGCTGGACTTGCTAGTTTTTTGTTCCCTCTTA 
GGCATTTATTAT/\ATAGTGATAAAAC/\ATTCTAGCTAG 
ATGCACGGTTCTGATAGTTTTTTTTATACCTTTACAAGCGGACTAGGG^^ 
TTCTTTTCTCCATTACTTTTCTTTTTTAATTTAACCTCTATGCCAGATGCTATCTATTTG 

GATTAGCTGCATGCTATTCTTTTCATAGATTATATCCAAAAATCAGTGCTTTCTTGATGATTTCCATCTCAGTT^ 
TGAGCTTCTTGAC/V\GTCAAATGGAACT/W\TTCTTGGTTAGATGTTTTCATTCTTCTTCCACTTGTTATA^ 
TTATCACAGAAAATAAAAGCAGAAGTTATTATCTTTCGATATCATTATTATTCATTCAAAATTACTACTTTG^ 
TTTTTGTATTCTTTACGGCTTAGrrTGTCTTTTACGTCTCAATGATTTTAAG/V^ 

GTCAATATGTGCTGCTTTAACAAGTGCTCTAGTAATACTTCCTACCTATCTAGATTTGTCAACTTATGGAGAGAATCTATCCC 

TAAAACAGTTAGTTACGAACAATGCTTGGTTTTTGGATATACCTGCTAAGCTCTCAATAGGAGTGTAGGATACTACGAAGT^ 

GCTCTGCCTATGATTTACGTAGGATTATTTCCCCTAATGCTTAGTGTTATTTATTTTACTTTAGAAAGTATCCCTTT/V^^ 

TTAGCCAATGCCTGCTTGTTAACTTTTATTATAAT/\AGTTTTTACCTACAGCCACTTGATCTTTTTTGGCAGGGGATGCACTCACC 

AAATATGTTTTTGCATCGCTACGCTTGGTCTTTTTCCATAGTTATCCTATTACTCGCATGTGAGACTCTCTCTCGACTAAAAGAAG 

TGACTCAAATAAAAGCAGGTTTTGCTTTTATTTTCCTCATTATAGTGACATCTGTTCGT^^

ACGTTTAACTCTTTTTTTACTTAGTGTTTTTTTATTATTAGGTTATACTATl^ 

CTTTTATTTCTGCTTTCATACTTATCTTTAGCGTTCTTGAATCAGGGTTAAACACCTACTACCAGCTTCAAGGAATTAATAAGG 

TGGGGATTCCCATCACGACAGATATATAATAGTCAATTAAAGGATATTAACAACCTTGTCAACTGTGTGTCAAAAAATAGTCAACC

TTTTTTTAGAATGGAAAGGCTACTTCCCCAAACAGGGAACGATAGCATGAAATTTAATTATTACGGGA 

TAAGAAATAGACTATCTAGTTCTTTATTGGATCGATTGGGATTTCAGTCTAAAGGCACAAATTT/WkCCTTAGAT^ 

ACTATTATTATGGACAGTCTACTTGGTATAAAATATAATCTTAGCGAAGGACCTCCAAATAAATTTGGATTTACA^^ 

AGCGGG/KATACTACTCTTTATCAAAATCACTATAGTAGGCCTTTAGCTATATTAACACGTAATGTTTACAAAGATGTCAACCTAAA 

TGTCAATACCCTTGATAACCAAACCAAATTACTTAACCAACTAAGTGGGAAATCTTTAACCTATTTTA^ 

TTATTTCTGGTGCTAATCAATTTAACGGACAAATATCTGCACAAGCTTCTGATTATCAAAACTCCGTTACCC 

ACATCCCT/W^CATAGTC/VACTCTATGTTAGCATACCCAATATTATATTTTCAAATGCTGATGCTAAAGAGATGCGTATTCAGACA 

GATAATCAT/^ATTTCATATATACTACAGATAACGCTTACTC I M M II GATTTAGGATATTTCGCCGATGCCAAAGTTGCTACATTT 

TCGTTTGTTTTTCCAAAAAATAAACAAATTAGTTT^ 

AATAGC ATT/W^CAAA/WV^TGTTCATACTTACGCTAAAAGT/\ATAGGGT/\ATCACTGATTATAA 

Am I I lACACTTCCTTACGATAAAGGTTGGTCAGCAC/WW^GATGGGA/WiiATCTTCCAGTC/WWSw^GC 

TATCAGTTACTATTCCTAAAGGAAAGGGACGTGTTATCCTTACCTTTATTCCTAATGGTTTT/s^ 

TAGGAATTATCGCTTATATGCTTTTGTATAAGTACATAGATATAAAGTCTAAATTACTTTAG 

ARF0450 
Seq ID 96 

ttttccagaWctgcccacacggagggactattcctcgctatggtcagcctcttgtcgaaatgaacactacaactctcaa^^ 
gaccactt 

ARF0569 
Seq ID 97 

tcgtttatttgggaaaaaagaaaccccgaaggtagc 

ARF0694 
Seq ID 98 

aaggtgaggaaaagacagaagttacaaaagagaagcttttggaattggctagatggattaaagacatctcagacgataccgacgaaaagacagaagatgaggc^ 
ARF0700
Seq ID 99 



ARF1007 
Seq ID 100 



ARF1145 
Seq ID 101 
ccaataaL 

caagcccattccggaagtcaagccaatgcagcagcltatatggcct 

ARF1208 
Seq ID 102 

Wcggaacatctttlgtgactttagttgctgttcctctmgtatcctgttatcaaaaacttaaaagaaaaggttacaat^^ 

ARF1262 
Seq ID 103 

tgggttactgggtgtaaatggtggtttttttgc 

ARF1294 
Seq ID 104 



tcttcttcagactacactgtcg 

ARF1316 
Seq ID 105 

ccccgaaatggttggggacgttttgaacgltatgaaagacttggcagaacaaggcatgaccatgttaattgttactcacgaaatgggatttgctcgcc^ 
ARF1352 

Seq ID 106

ttgatgaactgcccatcattgcacttcttgcaacccaagcacaaggaacaacctgtattaaagatgctcaagaattacgagtcaaagaaacagatcg 

ARF1481
Seq ID 107

aaaacaacgaagtgtaactacctcaaacgtcccaagctggtcgaatccagact^ 



taattgga 

ARF1557 
Seq ID 108 

ggcagaagactaccaccaagattacctcaagaaaaatccaaatggctactgccatat 

ARF1629 
Seq ID 109 

ttttggtcgccaggaagcagatatmgttagggacgcaaatgattgccaaaggactggatmcxjaaatgtgactttagttggggtacta 
cgtcagaaaaaacgtttcaattgttaacacaggttgctggtcgggcaggccgtgGcca 

ARF1654 
Seq ID 110 

tgcgtgccatcggtgaaatgctctattatgatccagatcaacacacctttatcaatWamccaaatacattggtccaggctg^ 



ARF2027 
Seq ID 111 



ARF2093 
Seq ID 112 

ttactgctgctgaagcagacatcaaagctgaacttgctgctgaaggcaaaccagaaaagatctgggacaaaatcatccc3ggtaaaatggaa:gd^^ 
accaagcttacactcttcttgcacaagtttacatcatggacgacagcaaaactgttgaagcttaccttgactcag 

ARF2207 
Seq ID 113 



CRF0038 
Seq ID 114 

ctgtacgctccacaatctctgtcaaatccttttcttgatagtattccatgtgaccag 

CRF0122 
Seq ID 115 
tggtattggtttgaatgaaatcagctgcacgtttcaacccttctgcgtcaccaccaaa

CRF0406 
Seq ID 116

caagcaccacttgacgatcacxsataacaagccaacatattggtcagggtatctt 

CRF0416 
Seq ID 117 

tatmctcaagaaaacaccmgaaggctgcaaaatcttggttgttgtcaccatttggagaaatggctaagacagggmccttgggra^^ 
gccaagtgtgtgtagaccattgtcagaaatcgttttggttgacacattggtctctgaaccaagagtgacaagtcctgtaaagtgm 

CRF0507 
Seq ID 118 



aagcggctatcagcaggtacaatctcccgaatgagagaagcaatcacatcaccaacaccagcagctgcaaaaatagctcc 

CRF0549 
Seq ID 119 

cmttatacatcgatcgcgmgatattggattttctcgtaatcaattmccttgtttgtccaaatttacgatgattt^^ 

CRF0569 
Seq ID 120 

tcgtttatttgggaaaaaagaaaccccgaaggtagc 

CRF0628 
Seq ID 121 

atcaaacatttgacccaagcaaaacaacggatgccagtatctaaggttctagtagctaaccccttagggtctaaaggaatagctgattca^ 



CRF0727 
Seq ID 122 



CRF0742 
Seq ID 123 



CRF0784 
Seq ID 124 



CRF0854 
Seq ID 125 

tcacctmacaaaatcagtmaacgccttctgcactcaaactttcttcgataaaacgtcctgtaaaaccacccaagaaccctgtcgccgtgct^ 
cacattaatgccmacctccagcaaacttatcatcactcgccatgcgattcactgacccaagatttatttggtcaattctgacaataaagtcaatagaagggt^^ 

CRF0875 
Seq ID 126 

gatcacaatatttattggcattatcaactaaaattgtcacaagttcaaacaatgacttttcctccgctt 

CRF0907 
Seq ID 127 

ccaaatctgctggaccattttctgccaaataacccgcatcaaaaccataaagcaaagctggat 

CRF0979 
Seq ID 128 

caatgtctcataattatcaataatatcaatmcgccataataaaaacactccttcgtgtatcttgaaaagagcgtctcta^ 

CRF1068 
Seq ID 129 

gaaaaactcttcagaacggcgcggcaacggtacaacttcaagtgggtttccaaaaagcaaatcatgggtctggtgcttatcgttttmgaagtcccg 
ttmctatcacatctttggccacattgacttgta:aacgcctccatcaacaataatcaagttaggagcttgcagtccct(x 
ataatcatctggtcccaccactgtmaatcttaaatttacgataatctttcttactaggtttgccatcaacaaaaacaaccatagccgra 
cttcaatgcgcacaggcttgtcaatcctaag 

CRF1152 
Seq ID 130 

gataactgtcgcggatgcctctcgacgaatctcgataacaaaacgc^ctccttcacggctagattcgtc^cgaactgctgtaataccttc 

CRF1203 
Seq ID 131 

ataggttccataaaccaactagtgagcttcgcmagtaacaatggatgaggtaaaaacattamaaaaccctaataactcctacttttgaggaatgc 

CRF1225 
Seq ID 132 
taccaagtgtgtcagctgcctgagggtgttccccaactgatcgtaaacgcagaccaaacctggttttataaagcaaaaaccaa

CRF1236 
Seq ID 133 

atacttgctaaaatagaaattagccaaaaaacactcataactataataataatgcgctca 

CRF1362 
Seq ID 134 




CRF1524 
Seq ID 135 

aaattcaatgttacgcttttgccactctgtcaaaacgagcttgtcatcactttgmctcmatctmgccatttac^^ 

cgg

CRF1525 
Seq ID 136 

gagacctctgctttgttagcagctcgcttcatacgctctgcggattctcccaaaataatcatctgcttaagtcctaaaaggtc^^ 
agcaatcaaaatcaagcgactgttatcaaaacctgataaagctttt 

CRF1527 
Seq ID 137 

aagagactggatgattgcaccctgatattcttctggtgtatcaatttgaac^cgctcaaac^gttcacatttgacacc^ 

CRF1588 
Seq ID 138 

gaatggctaagtcagcaacactctcatcaatttttacatgataagcgaagctatctggataccgcaffigataaaggtaaatg^^ 
tcttggc 

CRF1649 
Seq ID 139 

catgaccatttgtcacatcaacaatcgttgaagcaacttggaaatcttggtctggatagtaaacataatcaccagaatgataaatatte^ 

CRF1749 
Seq ID 140 

gtcatgttaaaagca(x:ggtcaaattgatWcaagacgcgctcaaaat<xtcttccgtcatmaagcatcaam^ 
tttcgatagcttcattgaccatacgtttagcttctgatgcttctgaaacata:ccagagatggtaacaactgtaacgccg 

CRF1903 
Seq ID 141 

agggaattatccagatttagtagctctgcaatcctgctaaaaccactagttaaggtgc«tcctaataggacaatggca<xtcctaatTO^ 
atggttgtttggcaaaaaacttttttagcaaagttacttgcaatt 

CRF1964 
Seq ID 142 

gagcacmcaaccaaatcaccaaattgggcaaaaatggaaaaaagagcgaccaaaac^aacatggtcaaaaagtgatgtggcgcataaac^ 

CRF2055 
Seq ID 143 

aaatcaaggtcmaaggacttcttcgatcacctcgtccacggttmtcctctccaactamtagtattaagcccatcacttgtte 
ccatggacatctcaaggggcaaccttttaaaaaaacggttgtccgaataccaggaccgtca 

CRF2091 
Seq ID 144 

cgaaatgaaatcaacacttcctctcctacgctttcttctaccaaaata 

CRF2096 
Seq ID 145 

tgctccttagcagcttttaccaaattttcaaaatcagccattgttccaaaatcagggttgaccgccgta 

CRF2104 
Seq ID 146 

gaggccatgtcttgccagcgttgctgttcttttttagctgactcttcWgatacgatatgamggccaagtcctccgtggtcate 
gtgtggg 

CRF2116 
Seq ID 147 

aatttcaaagctcgaccacctggtgttgmcaggamccaaacatcacaccaactttltcacgcaattggttaataaa^^^ 

CRF2153 
Seq ID 148 

aaagagacctgccactacagccaccaacatcatgtcattftccaagccaagttcgataggaactttttgccacatccagataaaagcggaca 



NRF0001 
Seq ID 149

attaccctgttgaccaagcaaacgcagcaactgttcaggaagcccagtctttcaaacaatctgttgaagcatctcttggtaaagagaatgtcattgtcaatgttctt^ 

aaacatcaactcacgaagcccaaggcttctatgctgagaccccagaacaacaagactacgatatcatttcatcatggtggggaccagactatcaagatccacggacctac 
cttgacatca 



NRF0003 
Seq ID 150 

tcgggaagacaggattcgaacctgcgacaccttggtcccaaaccaagtactctaccaagc 

SPy0012 
Seq ID 151 

MRKLL/i^MLMTFFLTPLPVISTEKKLIFSKNAVYQLKQDWQSTQFYNQIPSNPNLYQETCAYKDSDLTLPAGRLGVNQPLLI^ 

ESLPVFELADGTYVEANRQLIYDDIVLNQVDIDSYFVVTQKKLRLYSAPYVLGTQTIPSSFLFAQKVHATQMAQTNHGTYYLIDD 

SQEDLVQFDNRMLKVQEMLLQKYNNPNYSIFVKQLNTQTSAGINADKKMYAAS1SKU\PLYIVQKQLQKKKLAENKTLT^^ 

GDYDPLGSGKISKIADNKDYRVEDLLKAVAQQSDNVATNILGYYLCHQYDKAFRSEIKALSGIDWDMEQRLLTSRSAANMMEAIYHQK 

GQIISYLSNTEFDQQRITKNITVPVAHKIGDAYDYKHDVAIVYGNTPFILSIFTNKSTYEDITAIADDWGILK 

SPy0019 
Seq ID 152 

MKKRILSAVLVSGVTLGAATTVGAEDLSTKIAKQDSIISNLTTEQI<AAQNQVSALQAQVSSLQSEQDKLTARNTELEALSKRFEQEI^^ 

TSQIVARNEKLKNQARSAYKNNETSGYINALLNSKSlSDWNRLVAINRAVSANAKLLEQQKADKVSLEEKQ/s^NQTAINTIAANMA 

EENQNTLRTQQANLVAATANLALQLASATEDKANLVAQKEAAEI<AAAEAU\QEQ/\AKVKAQEQ/\AQC^ 

AQSSNAIEPAALTAPAAPSAGPQTSYDSSNTYPVGQCTWGAKSLAPWAGNNWGNGGQWAYSACW\GYRTGSTPMVGAIAVWND 
GGYGHVAWVEVQSASSIRVMESNYSGRQYIADHRGWFNPTGVTFIYPH 

SPy0025 
Seq ID 153 

MSSYFPVAPLSDLVSYMNKRIFVEKKADFGIKSASLVKELTHNLQLTSLKALRIVQWDVFNLAEDLLARAEKHIFSEQWDCLL^ 

AELDKVAFFAIEALPGQFDQRAASSQEALLLFGSDSQVKVNTAQLYLVNKDITEAELEAVKNYLLNPVDSRFKDITLPLEEQAFSVSDK 

TIPNLDFFETYQADDFATYKAEQGUKMEVDDLLFIQNYFKSIGCVPTETELKVLDTYWSDHCRHTTFETELKNIDFSA^ 

DKYIAMRDELGRSEKPQTLMDMATIFGRYERANGRLDDMEVSDEINACSVEIEVDVDGVKEPWLLMFKNETHNHPTEIEPFGGAATCl 

GGAIRDPLSGRSYWQAMRISGAGDITTPIAETRAGKLPQQVISKTAAHGYSSYGNQIGU^TTYVREYFHPGFVAKRMELGAWG^^ 

KENWREKPEAGDWILLGGKTGRDGVGGATGSSKVQTVESVETAGAEVQKGNAIEERKIQRLFRDGNNyTRLIKKSNDFGAGGVCVA 

IGEU\DGLEIDLDKVPLKYQGLNGTElAISESQERMSVWRPNDVDAFIAACNKENIDAVWATWEKPNLVMTWNGEn 

NGVRVWDAKWDKDLTVPEARTTSAETLEADTLKVLSDLNHASQKGLQTIFDSSVGRSTVNHPIGGRYQITPTESSVQKLPVQHGVT 

TTASVMAQGYNPYIAEWSPYHGAAYAVIEATARLVATGADWSRARFSYQEYFERMDKQAERFGQPVSALLGSIEAQIQLGLPSIGGK 

DSMSGTFEDLTVPPTLVAFGNn-TADSRKVLSPEFKAAGENIYYIPGQAISEDIDFDLIKDNFSQFEAIQAQHKITAASAAKYGG 

MTFGNRIGASVEIAELDSSLTAQLGGFVFTSAEEIADAVKIGQTQADFTVTVNGNDLAGASLLAAFEGKLEEVYPTEFEQTDVLEEVPA 

WSDTVIKAKETIEKPNAA'jPVFPGTNSEYDSAKAFEQVGASVNLVPFWLNEVAIAESVDTMVANIAKANIIFFAGGFS^ 

IVNILLNEKVRAAIDSFIEKGGLIIGICNGFQALVKSGLLPYGNFEEAGETSPTLFYNDANQHVAKMVETRIANTNSPWLAGVEVGDIHAI 

PVSHGEGKLWSASEFAELRDNGQIWSQYVDFDGQPSMDSKYNPNGSVNAIEGITSKNGQIIGKMGHSERWEDGLFQNIPGNKDQIL 

FASAVKYFTGK 

SPy0031 

Seq ID 154

MKKFHRFLVSGVILLGFNGLVPTMPSTLISQQENLVHAAVLGDNYPSKWKKGNGIDSWNMYIRQCTSFAAFRLSSANGFQLPKGYGN 
ACTWGHIAKNQGYPVNKTPSIGAIAWFDKNAYQSNAAYGHVAWVADIRGDTVTIEEYNYNAGQGPERYHKRQIPKSQVSGYIHFKDL 
SSQTSHSYPRQLKHISQASFDPSGTYHFTTRLPVKGQTSIDSPDLAYYEAGQS\ATDK\An"AGGYTWLSYLSFSGNRRY!PIKEPAQS 
WQNDNTKPSIKVGDTNn-FPGVFRVDQLVNNLIVNKELAGGDPTPLNWIDPTPLDETDNQGKVLGDQILRVGEYFIVTGSYKVLKIDQP 
SNGIYVQIGSRGTWVNADKANKL 

SPy0103 
Seq ID 155 

MINQWNNLRHKKLKGFTLLEMLLVILVISVLMLLFVPNLSKQKDRWETGNAAWKLVENQAELYELSQGSKPSLSQL^ 
KAYQDYYDKHKNEKARLSN 

SPy0112 
Seq ID 156 

MKlGIIGVGKMASAIIKGLKQTPHELIISGSSLERSKEIAEQI^LPYAMSHQDLIDQVDLVILGIKPQLFETVLKPLHFKQPnSMAAGISLQR 
LATFVGQDLPLLRIMPNMNAQILQSSTALTGNALVSQELQARVRDLTDSFGSTFDISEKDFDTFTALAGSSPAYIYLFIEALAKAGVKNG 
IPKAKALEIWQTVUKSASNLKTSSQSPHDFIDAICSPGGTTIAGLMELERLGLTATVSSAIDKTIDKAKSL 

SPy0115 
Seq ID 157 

MTDLFSKIKEVTELDGIAGYEHSVRDYLRTKITPLVDRVETDGLGGIFGIRDSKAEKAPRILVAAHMDEVGFMVSDIKVDGTLRWGIGG 
WNPLWSSQRFTLYTRTGQVIPLISGSVPPHFLRGANGSASLPHIEDIVFDGGFTDKAEAERFGITPGDIIIPQSETILTANQKNIISKAWD 
NRYGVLMITEMLEALKGQDLNNTLIAGANVQEEVGLRGAHVSTTKFDPELFFAVDCSPAGDIYGNPGTIGDGTLLRFYDPGHVMLKD 
MRDFLLTTAEEAGVNFQYYCGKGGTDAGAAHLQNGGVPSTTIGVCARYIHSHQTLYAMDDFVEAQAFLQAIIKKLDRSTVDLIKCY 

SPy0166 
Seq ID 158 

MEDISDPEVILEYGWPAFIKGYTQLKANIEEALLEMSNSGQALDIYQAVQTLNAENMLLNYYESLPFYLNRQSILANMTK^^ 
AMAHYKLGEFAHYQDTMLDMVERTIKTF 

SPy0167 
Seq ID 159

MSNKKTFKKYSRVAGLLTAALIIGNLNATANAESNKQNTASTETTTTNEQPKPESSELTTEKAGQKTDDMLNSNDMIKI^^ 

EKEEKKSEDKKKSEEDHTEEiNDKlYSLNYNELEVLAKNGETIENFVPKEGVKKADKFIVIERKKKNlNTTPVDISllDSVTDRTYPAALQL 

ANKGFTENKPDAXAn-KRNPQKlHIDLPGMGDKATVEVNDPTYANVSTAIDNLVNQWHDNYSGGNTLPARTQYTESMWSKSQIEAA^ 

NVNSKILDGTLGIDFKSISKGEKKVMIAAYKQIFYTVSANLPNNPADVFDKSVTFKELQRKGVSNEAPPLFVSNVAYGRTVFVKLETSSK 

SNDVEAAFSAALKGTDVKTNGKYSDILENSSFTAWLGGDAAEHNKWTKDFDVIRNVIKDNATFSRKNPAYPISYTSVFLKNNKIAGV 

NNRTEYVETTSTEYTSGKlNLSHQGAYVAQYEILWDEINYDDKGKEVITKRRWDNNWYSKTSPFSTVlPLGANSRNiRIMARECTGl-A 

WEVWVRKVIDERDVKLSKEINVNISGSTLSPYGSITYK 

SPy0168 
Seq ID 160 

MKQQSYQPLRFWLLVALF/^LLLIARPVMADEGTNSADAAYYKGQSAGKKAGKKAGKEATmDLTPTVPTNPETPSDIGETTNKQL 
YKEGYKDGYKEGYNEGWKSQYPVLTPVKVIWDLISYWLQRLFPNNQSSTAAQSMS 

SPy0171 
Seq ID 161 

VKNKLFLVALAT\m^LGPSLATPHHQWHASD\n-LTETGDKNGTVCFGYENVDGEVCKLTADGKGTICVG^ 
SNWFWGFLNYLWTTIKSWVS 

SPy0183 
Seq ID 162 

METILEVKHLSKIFGKKQKAALEMVKTGKNKSEIFKKTGAWGWDASFEVKKGEIFVIMGLSGSGKSTLVRMLNRLIEPSAGSILL^ 

DISTMSADQLREVRRHDINMVFQSFALFPHKT1LENTEFGLELRGVPKEERQRLAE1<ALDNSGLLDFKDQYPNQLSGGMQQRVGLAR 

ALANSPKILLMDEAFSALDPLIRREMQDELLDLQDSMKQTIIFISHDLNEALRIGDRIALMKDGQIMQIGTGEEILTNPANDFVREFVEDV 

DRSKVLTAQNIMIKPLTTTVELDGPQVALNRMHNEEVSMLMATNRRRQLVGSLTADAAIEARKKGLPLSEVIDRDVRTVSKDTIITDILP 

LIYDSSAPIAVTDDNNRLLGVIIRGRVIEALANISDEDLN 

SPy0230 
Seq ID 163 

MKTARFFWFYFKRYRFSFWIAVAVILATYLQVKAPVFLGESLTELGKIGQAYYVAKMSGQTHFSPDLSAFNAVMFKLLMTYFFTVL^^ 

LIYSFLLTRWSHSTNRMRKGLFGKLERLTVAFFDRHKDGEILSRFTSDLDNIQNSLNQSLIQWTNIALYIGLNAVMMFRQDSRL^^ 

STPVALIFLVINIRLARKYTNIQQQEVSALNAFMDETISGQKAIIVQGVQEDTMTAFLKHNERVRQATFKRRLFSGQLFPVMNGMSLINT 

AIVIFVGSTIVLSDKSMPAAAALGLVVTFVQYSQQYYQPMMQIASSWGELQLAFTGAHRIQEMFDETEEVRPQNAPAFTSLKEAVAIN 

HVDFGYLPGQKVLSDVSIVAPKGKMIAWGPTGSGKTTIMNLINRFYDVDAGSITFDGRDIRDYDLDSLRQKVGIVLQESVLFSGTITDN 

IRFGDQTISQDMVETAARATHIHDFIMSLPKGYNTYVSDDDNVFSTGQKQLISIARTLLTDPEVLILDEATSNVDTVTESKIQRAMEAIVA 

GRTSFVIAHRLKTILNADHMVLKDGKVIEQGNHHELLHQKGFYAELYHNQFVFE 

SPy0269 
Seq ID 164 

MDLEQTKPNQVKQKIALTSTIALLSASVGVSHQVKADDI^SGETKASNTHDDSLPKPETIQEAKATIDAVEKTLSQQKAELTELA^ 

KTTAEINHLKEQQDNEQKALTSAQEIYTNTL^SSEETLLAQGAEHQRELTATETELHNAQADQHSKETALSEQKASISAETTI^ 

QVKTSEQNIAKLNAMISNPDAITK/SJ^QTANDNTKALSSELEKAKADLENQKAKVKKQLTEEU^AQ^ 

IVGNNTMKAPQGYPLEELKKLEASGYIGSASYNNYYKEHADQIIAKASPGNQLNQYQDIPADRNRFVDPDNLTPEVQNELAQFAAHMI 

NSVRRQLGLPPVTVTAGSQEFARLLSTSYKKTHGNTRPSFVYGQPGVSGHYGVGPHDKTIIEDSAGASGLIRNDDNMYENIGAFNDV 

HTVNGIKRGIYDSIKYMLFTDHLHGNTYGHAINFLRVDKHNPNAPVYLGFSTSNVGSLNEHFVMFPESNIANHQRFNKTPIKAVGSTKD 

YAQRVGTVSDTIAAIKGKVSSLENRLSAIHQEADIM/VAQAKVSQLQGKLASTLKQSDSLNLQVRQLNDTKGSLRTELUVKKAKC^ 

ATRDQSIj^KLASLKAALHQTEALAEQAAARWALVAKKAHLQYLRDFKLNPNRLQVIRERIDNTKQDLj^KTTSSLLNAQ 

SSLEATIATTEHQLTLLKTIJ\NEKEYRHLDED1ATVPDLQVAPPLTGVKPLSYSKIDTTPLVQEMVKETKQLLEASARLAAENT^ 

LVGQTSEMVASNAIVSKITSSITQPSSKTSYGSGSSTTSNLISDVDESTQRALKAGWMLAAVGLTGFRFRKESK 

SPy0287 
Seq ID 165 

MTKEKLVAFSQAHAEPAWLQERRLAALEAIPNLELPTIERVKFHRWNLGDGTLTENESLASVPDFIAIGDNPKLVQVGTQTVLEQLPM 

ALIDKGWFSDFYTALEEIPEVIEAHFGQALAFDEDKLAAYHTAYFNSAAVLYVPDHLEITTPIEAIFLQDSDSDVPFNKHVLVIAGKESKF 

TYLERFESIGNATQKISANISVEVIAQAGSQIKFSAIDRLGPSVTTYISRRGRLEKDANIDWALAVMNEGNVIADFDSDLIGQGSQADLKV 

VAASSGRQVQGIDTRVTNYGQRTVGHILQHGVILERGTLTFNGIGHILKDAKGADAQQESRVLMLSDQARADANPILLIDENEVTAGH
AASIGQVDPEDMYYLMSRGLDQETAERLVIRGFLGAVIAEIPIPSVRQEIIKVLDEKLLNR

SPy0292 
Seq ID 166 

MIKRLISLWIALFF/^STVSGEEYSVTAKHAIAVDLESGKVLYEKDAKEWPVASVSKLLTTYLWKEVSKGKLNWDSP\n-|SNYPY

TNYTISNVPLDKRKYTVKELLSALWNNANSPAIAIJ\EKIGGTEPKFVDKMKKQLRQWGISDAKWNSTGLTNHFLGANTYPNTEPDD

ENCFCATDLAIIARHLLLEFPEVLKLSSKSSTIFAGQTIYSYNYMLKGMPCYREGVDGLFVGYSKKAGASFVATSVENQMRVITWLNA

DQSHEDDLAIFKTTNQLLQYLLINFQKVQLIENNKPVKTLYVLDSPEKTVKLVAQNSLFFIKPIHTKTKNWHITKKSSTMIAPLSKGQVL^

RATLQDKHLIGQGYLDTPPSINLILQKNISKSFFLKVWWNRFVRYVNTSL

SPy0295 
Seq ID 167

MESIDKSKFRFVERDSEASEVIDTPAYSYWKSVFRQFFSKKSWFMLVILWVLMMSFIYPMFANYDFNDVSNINDFSKRYIWPNAEY
WFGTDKNGQSLFDGXAAT/GARNSILISVIATUNITIGVVLGAIWGVSKAFDKVMIEIYNMSNIPSMLIIIVLT^^
VAYSIRVQILRYRDLEYNLJ^SQTLGTPMYKIAVKNLLPQLVSVIMTMLSQMLPVYVSSEAFLSFFGIGLPTTTPSLGRFIANYSSNLT^
YLFWIPLVTLILVSLPLYIVGQNLADASDPRSHR

SPy0348 
Seq ID 168 

LJ^LTDFKDKDQQDQQRSFKEQIU\ELEKANQIRKEKEEELFQKELEAKE/V^RRTAQLY/^EYKRQDAFQKESIAHNNKTAKHFQAIKGA
VMTSEALKPTLLSEKENSSLKTTNKRWQANELQETASKESQVPLTIEKGHSVRRKLSKRQQTERAAKKISWLISSlllTLLAW

YVYSALNPVDKNSDAFVQVElPSGSGNKLlGQILQKKGLiKNSTVFSFYTKFKNFTNFQSGYYNLQKSMSLEEIASALQEGGTAEPTKP 

SLGKILIPEGYTIKQIAKAVEHNSKGKTKKAKTPFNEKDFLDLWDEAFIQDMVKRYPKLI^TIPTKEKAIYRLEGYLFPATYNYYKETTM 

RELVEDMLAAMDATLVPYYDKIAASGKTVNEVLTlj^SLVEKEGSTDDDRRQIASVFYNRLNSGMALQSNIAlLYAMGKLGEKTTLAE 

TIDTTINSPYNIYTNTGLMPGPVASSGVSAIEATLNPASTDYLYFVANVHTGEVYYAKTFEEHSANVEKYVNSQIQ 

SPy0416 
Seq ID 169 

VEKKQRFSLRKYKSGTFSVL1GSVFLVMTTWAADELSTMSEPTITNHAQQQAQHLTNTELSSAESKSQDTSQITLI<TNREKEQSQDL 

VSEPTTTELADTDAASMAfSiTGSDATQKSASLPPVNTDVHDWVKTKGAWDKGYKGQGKWAVlDTGIDPAHQSMRISDVSTA!^^^^ 

EDMUKRQKAAGINYGSWINDKWFAHNYVENSDNIKENQFEDFDEDWENFEFDAEAEPKAIKKHKIYRPQSTQ^ 

HDIDmQTDDDTKYESHGMH\n'GIVAGNSKEAAATGERFLG!APEAQVMFMRVFANDIMGSAESLFIKAlEDAVALGADVINLSLGTA 

NGAQLSGSI<PLMEAlEKAKKAGVSVW/KAGNERWGSDHDDPlj^TNPDYGLVGSPSTGRTPTSVAAINSKWVI^ 

LNHGKAIYSESVDFKDIKDSLGYDKSHQFAYVKESTDAGYNAQDVKGKIALIERDPNKTYDEMIALAKKHGALGVLIFNNKPGQSNRS 

MRLTANGMGIPSAFISHEFGKAMSQLNGNGTGSLEFDSWSKAPSQKGNEMNHFSNWGLTSDGYLKPDITAPGGDIYSTYNDNHYG 

SQTGTSMASPQIAGASLLVKQYLEKTQPNLPKEKIADIVKNLLMSNAQIHVNPETKTTTSPRQQGAGLLNIDGAVTSGLYVTGKDNYG 

SlSLGNITDTMTFDXrrVHNLSNKDKTLRYDTELLTDHVDPQKGRFTLTSHSLKTYQGGEXn'VPANGKV/rVRVTMDVSQFTl^^ 

NGYYLEGFVRFRDSQDDQLNRVNIPFVGFKGQFENLAVAEESIYRLKSQGKTGFYFDESGPKDDIYVGKHFTGLVTLGSETNVSTKTI 

SDNGLHTLGTFKNADGKF1LEKNAQGNPVLAISPNGDNNQDFAAFKGVFLRKYQGLI<ASWHASDKEHKNPLWVSPESFKGDKNFN 

SDIRFAKSTTLLGTAFSGKSLTGAELPDGHYHYWSYYPDWGAKRQEMTFDMILDRQKPVLSQATFDPETNRFKPEPLKDRGLAGV 

RKDSVFYLERKDNKPYTVT1NDSYKYVSVEDNKTFVERQADGSFILPLDKAKLGDFYYMVEDFAGNVAIAKLGDHLPQTLGI<TPIKLKL 

TDGNYQTKETLKDNLEMTQSDTGL\n"NQAQLJKWHRNQPQSQLTKMNQDFFISPNEDGNKDFVAFKGLKNNWNDLTVNW 

QKQTP1WSSQAGASVSAIESTAWYG!TARGSKVMPGDYQY>^YRDEHGKEHQKQYTISVNDK1<PMITQGRFDTINGVDHF 

ALDSSGIVREEVFYLAKKNGRKFDVTEGKDGITVSDNKVYIPKNPDGSYTISKRDGVTLSDYYYLVEDRAGNVSFATLRDLKAVGKDK 

AWNFGLDLPVPEDKQIVNFTYLVRDADGKPIENLEYYNNSGNSLILPYGKYTVELLTYDTNAAKLESDKIVSFTLSADNNFQQVTFKIT 

MLATSQITAHFDHLLPEGSRVSLKTAQDQLIPLEQSLYVPKAYGKTVQEGTYEVWSLPKGYRIEGNTKVNTLPNEVHELSLRLVKVGD 

ASDSTGDHKVMSKNNSQALTASATPTKSTTSATAKALPSTGEKMGLKLRIVGLVLLGLTCVFSRKKSTKD 

SPy0430 
Seq ID 170 

MKWSGFMKTKSKRFLNLATLCLALLGTTLLMAHPVQAEVISKRDYMTRFGLGDLEDDSANYPSNLEARYKGYLEGYEKGLKGDDIPE 

RPKIQVPEDVQPSDHGDYRDGYEEGFGEGQHKRDPLETEAEDDSQGGRQEGRQGHQEGADSSDLNVEESDGLSVIDEWGVIYQA 

FSTIWTYLSGLF 

SPy0433 
Seq ID 171 

MKKTLTLLLALFAIGWSSVRAEDEQSSTQKPVKFDLDGPQQKIKDYSGNTITLEDLYVGSKWKIYIPQGVWVVYLYRQCDH^ 

LASPILEKNITKTDPYRQYYTGVPYILNLGEDPLKKGEKLTFSFKGEDGFYVGSYIYRDSDTIKKEKEAEEALQKKEEEKQQKQLEESM 

LKQIREEDHKPWHQRLSESIQDQWWNFKGLFQ 

SPy0437 
Seq ID 172 

MKI<TLTLLU\LFAIG\n*SSVRAEDEQNKFILDGLQEKVKEVSVSDFSVGESKlK\M/LPQAWSVKlSREHSPKSSISNSGE 

NKEGQFSKRLPYGTQHTIKLSSQLTKGERVTLTFRDEDFWGAGYCFYRDSLSIKEDKQYEEEIKKIEDDLERQDLENDALEMFKKQTE 

REANKPWHQRLSENIQDQWWNFKGLFQ 

SPy0469 
Seq ID 173 

MIITKKSLF\rrSVALSLVPLATAQAQEWTPRS\n"EIKSELVLVDNVFTYTVKYGDTLSTIAEAMGIDVHVLGDlNHIANlDLIFPD 

NQHGQATNLTVQAPASSPASVSHVPSSEPLPQASATSQPTVPMAPPATPSDVPTTPFASAKPDSSVTASSELTSSTNDVSTELSSES 

QKQPEVPQEAVPTPKAAETTEVEPKTDISEAPTSANRPVPNESASEEVSSAAPAQAPAEKEETSAPAAQKAVADTTSVATSNGLSYA 

PNHAYNPMNAGLQPQTAAFKEEVASAFGITSFSGYRPGDPGDHGKGLAlDFMVPENSALGDQVAQYAIDHMAERGiSYVIWKQRFY 

APFASIYGPAYTWNPMPDRGSITENHYDHVHVSFNA 

SPy0488 
Seq ID 174 

LRQIQSIRLIDVLELAFGVGYKEETTSQFSSDQPSQWLYRGEANTVRFAYTNQMSLMKDIRIALDGSDKSLTAQIVPGMGHVYEGFQT. 
SARGIFTMSGVPESTVPVANPNVQTKYIRYFKVIDDMHNTMYKGTVFLVQPCMWKYTMKSVDQLPVDDLNHIGVAGIERMTTLIKNAG 
ALLTTGGSGAFPDNIKVSINPKGRQATITYGDGSTDIIPPAVLWKKGSVKEPTEADQSVGTPTPGIPGKFKRDQSLNEHEAMVNVEPLS 
HWKDNIKVIDEKSTGRFEPFRPNEDEKEKPASDVKVRPAEVGSWLEPATALPSVEMSAEDRLKS 

SPy0515 
Seq ID 175 

MKVLLYLEAENYLRKSGIGI^IKHQAKALSLVGQHFTTNPRETYDLVHLNTYGLKSWLLMIK^ 

LLSPWFKKYLCHFYNKADAIITPTLYSKSLIESYGVKSPIFAVSNGIDLEQYGADPKKEAAFRRYFDIKEGEKNA/MGAGLFFLRKGI^ 

VKVAQAMPDVRFIWFGETN1<WVIPAQVRQMVNGNHPKNLIFPGY1KGDWEGAMTGADAFFFPSREETEGIVVLEALASRQHLVL^ 

PVYYGWVDQSSAELATDIPGFIEALKKVFSGASNKVEAGYKVAQSRRLETVGHALVDWKKVMEL 

SPy0580 
Seq ID 176 

MENNNNHNIAEALSVSLHQIEQVU^LTAQGNTIPFIARYRKEWGNLDEWIKSIIDMDKSLTTLNERKATILAKIEEQGKLTDQL^^ 

TEKLADLEELYLPYKEKRRTKATIAREAGLFPL7\RL1LQNAQNLETAAEPFWEGFASPQEALAGAVDILVEAMSEDAKLRSVVTYNEIW 

QYSRLVSTLKDEQLDEKKVFQIYYDFSDQVSNMQGYRTLALNRGEKLGILKVSFEHNLEKMQRFFSVRFKETNPYIEEVINQTIKKKIV 

PAMERRVRSELSDAAEDGAIHLFSENLRHLLLVSPLKGKMVLGFDPAFRTGAKLAIVDQTGKLLTTQVIYPVAPASQTKIQAAKETLTQ 

LIETYQIDIIAIGNGTASRESEAFVADVLKDFPNTSYVIVNESGASWSASELARHEFPDLTVEKRSAISIARRLQDPLAELVKIDPKSIGV 

GQYQHDVSQKKLSENLGFWDTWNQVGVNVNTASPSLU^HVSGLNKTISENIVKYREENGALTSRADIKKVPRLGAKAFECW^ 
IPGAKNILDNTGVHPESYPAVKELFKVLGIQDLDDAAKATLAAVQVPQMAETLAIGQETLKDIIADLLKPGRDLRDDFEAPILRQDILDLK
DLEIGQKLEGWRNWDFGAFVDIGVHEDGLIHISEMSKTFVNHPSQWSVGDLVTVWVSKIDLDRHKVNLSLLPPRDTH 

SPy0621 
Seq ID 177 

MNEKVFRDPVHNYIHlDNPLIYDLINTKEFQRLRRIKQVPTTAFTFHGAEHSRFSHCLGWEIARRNn-AIFEEKYADlWNKDESLNATMT^ 

ALLHDIGHGAYSHTFEVLFHTDHEAFTQEIITNPETEINAILVRHAPDFPDKVASVINHTYPNKQWQLISSQIDCDRMDYLLRDSYFSAA 

NYGQFDLMRILRVIRPVEDGIVFEHSGMHAVEDYIVSRFQMYMQWFHPASRAVELILQNLLKRAQHLYPEQQAYFQKTAPGLIPFFE 

KKANLOiDYIALDDGVMNTYFQVWMASEDHILSDIJ^SRFlNRKILKSWFDQDSQGELERLRQLVESVGFDPDYY^ 

PELENPRTQIEMMQKDGSLAELSQLSPIVKALTGTTYGDRRFYFPKEMLELDDLFAPSKETFMSYISNGHFHFSQ 

SPy0630 
Seq ID 178 

MDINLLQALLIGLWTAFCFSGMLLGIYTNRCIILSFGVGIILGDLPTALSMGAISELAYMGFGVGAGGWPPNPIGPGIFGTLMAITSAGKV 
TPEAAU\LSTPIAVAIQFLQTFAYTAFAGAPETAKKQLQKGNIRGFKFAANGTIWAFAFIGLGLGLLGALSMDTLLHLVDYIPPVLLNGLT 
VAGKMLPAIGFAMILSVMAKKELIPFVLIGYVCAAYLQIPTIGIAIIGIIF/M-NEFYNKPKQVDATTVQGGQQDDWI 

SPy0681 
Seq ID 179 

LTPRSGKTTAGHFRYARYLIESEDENHLVTAYNQEQAYRLFIDGDGTGLMHIFDGNCEIKHDERGDHLLITTPKGNKRVYYKGGGKVN 
SVGAITGMSLGSWFCEINLLHMDFIQEGFRRTWAAKLRYHLADLNPPAPQHPVIKDVFDVQNTRWTHmMDDNPILTAERKQNIINS 
LKKNPYLYKRDVLGQRVMPQGVIYGLFDTEKNVLDALIGEPVEMYFCADGGQSDATSMSCNIVTRVRDNGRISFRLNRVAHYYHSGA 
DTGQVI<AMSWALELKVFIDWCVKKYQMRYTEVFVDPACKSLREELHKLGVFTLGAPNNSKDVSSKAKGIEVG1ERGQNIISDGAFYL 
VNHSEEEYDHYHFLKEIGLYSRDDNGKPIDKDNHAMDEFRYSVNVFVHRYYN 

SPy0683 
Seq ID 180 

MKKKPIKLNDEQLLLEASQLSDMYHQLTLDLFDQVIERIKARGSASLADNPYLWQANKLHDVGLLNADNIKLIAKYSGIAEAQLRYIIKNE 

GFKIYKNTSEQLEEALGRESGVNSTIQDDLSNYARCMIDDVHNLTNTTLPFSVlGAYQGIIQDAVAGNAn'GLKTPDQAINQTVIK^ 

FYGFTDKAGRKWRADSYARTVINTTTWRVFNEAKEAPAREFGIDTFYYSKKATAREMCAPLQHQIVT^ 

EPDGCLGINCKHTKTPFWGVNSKPELPEHLKNITPAQAKANANAQAKQRAIERSIRKSKELLHVAKQLGDKELIRQYQSDVRSKQDA 
LNYLINNNAFLHRNCMREKRYNNPYTKTQSEVEVRKEKAKLDKRRDVESAllGVETSEGIPLKITKHLAERAVLRNIAPIDIVDSIKEP^^^ 
APIKYDNLDRPSQKYIGKCVSTVINPIDGNIVTVHATSTRIRKKYGGN 

SPy0702 
Seq ID 181 

MSRDPTLILDESNLVIGKDGRVHYTFTTEDDNPKVRU^SKCLGTAHFNQLMIERGDQATSYVAPVWEGTGNPTGLFKDLK^ 

TANSQLWSKIKLTNRGMLQEYYDGKIKTEIVNSARGVATRISEDTDKKLALINDTIDGIRREYRDADRKLSASYQAGIEGLKATMANDKI 

GLQAEIKASAQGLSQKYDDELRKLSAKITTTSSGTTEAYESKLAGLRAEFTRSNQGTRTELESQISGLRAVQQSTASQISQEIRDREGA 

VSRVQQSLESYQRRMQDAEENYSSLTHTVRGLQSDVGSPTGKIQSRLTQLAGQIEQRVTRDGVMSIISGAGDSIKLAIQKAGGINAKM 

SGNEIISAINLNSYGVTIAGKHlALDGNTTVNGTFTTKIAEAIKIRADQnAGTIDAARIRVINLNASSIVGLDANFIKAKlGYAlTDLLEGKVIK 

ARNGAMLIDLNTAKMDFNSDATINFNSKNNALVRKDGTHTAFVHFSNATPKGYTGSALYASIGITSSGDGVNSASSGRFAGLRSFRYA 

TGYNHTAAVDQTEIYGDNVLWDDFNITRGFKFRPDKMQKMLDMNDLYAAWALGRCWGHLANVGWNTAHSNFTSAVNRELNNYIT 

Kl 

SPy0710 
Seq ID 182 

MTFLDKIKQGCLDGWAKYKILPSLTAAO^ILESGWGKHAPHNALFGIKADSSVVTGKSFDTKTQEEYQAGVWD 

ADHGQFLVDNPRYEAVIGETDYKKACYAIKAAGYATASSYVELLIQLIEENDLQSWDREALKNNKEETMTTANEIVQYCVNLANSGMG 
VDKDGAHGTQCCDLPCFVAKNWFGVDLWGNAIDLLDSASAQGWEVHRMPTEANPKAGATFVQSVPYHQFGHTGIVIEDSDGYTMR 
TVEQNIDGNPDALYVGAPARFNTRDFTGVIGWFYPPYQGDTNTTQPVSTEPQTSDTIVETAKTGTFTLDVAEINIRRWPSLASEWGIYK 
QGDTVSFDSEGYANGYYWISYVGGSGMRNYLGIGQTDKDGNRISLWGKLN 

SPy0711 
Seq ID 183 

MKKINIIKIVFIITVILISTISPIIKSDSKKDISNVKSDLLYAYTITPYDYKNCRVNFSTTHTLNIDTQKYRGKDYYISSEIVISYEASQKFKRDDH 
VDVFGLFYILNSHTGEYIYGGITPAQNNKVNHKLLGNLFISGESQQNLNNKMLEKDIVTFQEIDFKIRKYLMDNYKIYDATSPYVSGRIEIG 
TKDGKHEQIDLFDSPNEGTRSDIFAKYKDNRIINMKNFSHFDIYLEK 

SPy0720 
Seq ID 184 

MITTFETILDKIKAHQTIIIHRHQNPDPDALG8QAGLKEIIAQNFPDKKVLMTGFDEPSLJ^WISQMDQ^n■DKDYKEALVIITO^ 
ERYTLGKCLIKIDHHPNDDWGDFYYVDTSASSASEIIADFAFSQNLTLSDKAAKLLYTGIVGDTGRFLYASTTSKTLSIASQLRHFEFDF 
AAISRQMDSFPLKIAKLQSYVFEHLTIDESGAAYVLVSQETLKHFDVTLj^ESSAIVCAPGKIDNVQAWAIFVELTDGNYRVRM 
NGIAKRHGGGGHPLASGANSANLEENQAIFRELIAVCQEI 

SPy0727 
Seq ID 185 

MIEENI<HFEKKMQEYDASQ1QVLEGLEAVRMRPGMYIGSTAKEGLHHL\A/VE1VDNSIDEALAGFASHIKVFIEADNS1TWDDGRG1PV 

DIQAKTGRPAVEWFTVLHAGGKFGGGGYKVSGGLHGVGSSWNALSTQLDVRWKNGQIHYQEFKRGAWADLEVIGTTDVTGTTV 

HFTPDPEIFTETTQFDYSVLAKRIQELAFLNRGLKISITDKRSGMEQEEHFLYEGGIGSYVEFLNDKKDVIFETPIYTDGELEGIAVEVAM 

QYTTSYQETVMSFANNIHTHEGGTHEQGFRAALTRVINDYAKKNKILKENEDNLTGEDVREGLTAVISVKHPNPQFEGQTKrKLGNSE 

WKITNRLFSEAFQRFLLENPQVARKIVEKGILASKARIAAKRAREVTRKKSGLEISNLPGKLADCSSNDANQNELFIVEGDSAGGSAKS 

GRNREFQAILPIRGKILNVEKATMDKILANEEIRSLFTAMGTGFGADFDVSKARYQKLVIMTDADVDGAHIRTLLLTLIYRFMRPVLEAG

YWIAQPPIYGVKVGSEIKEYIQPGIDQEDQLKTALEKYSIGRSKPTVQRYKGLGEMDDHQLWETTMDPENRLMARVTVDDAAEADKV

FDMLMGDRVEPRRDFIEENAVYSTLDI
SPy0737
Seq ID 186 

MRKVKKVFVSSCMLLTVGLGVAVPTGFSQSNGVMWKAAEVPATDLSRQASDSERVDESSLLQKENLSVDSFKLENLNGWEAENDT 

AGNLGKFKDPDSSGYQNILTSSGKNISVAVAPKGSGKMNIKVTKRSNFQGGYYVGGLRTQTPVLKLNDVYRYSFTTKKLSGNSSEFK 

TRVKPVESNNKLGKELVIRVDNKNVSTKHDWLPDISDGTHTVDFTGLDKKLSVAFRFSPRQTSNVVYEFSNINIKNISPASVPAIPSKVL 

EGTSVLSGTAISSGDTLEKRKSFDGDILRVYKDSKIIARTVIKGNKWDVKLSKPLIAGEKLDFEILHPRSQNVSKKISKQVEAKPFDPASY 

KEKVIAKLKPVYEATSEKITNDAWLDENAKDLQKQKLEEQYISGKVAISEAGTKQEAIDAAYNKYSSQTDPDSLPSQYKQGNKENEQE 

KGRQDLIQTRDLTLKAIQEDKWLTEQEKTIQKEEALKAFETGIESVNQWSLEQLKQRLIWKASEKDSEKKEYPESIPNQHIPGK^ 

KAAKQEELKKLHDTTLEKINQDKWLTPDQQAEQLKQAEWFKKGQEAIKSAQTLTQLETDLADYVSENEGKGN 

VNKAEVKLKEAHEATKQA!EKDPWLSPEQKKAQKEKAKARLDEGLI<ALI<AADSLEILK\n"EEAFVDK^ 

QALDSLDKEVQKELESIDNDNTLTTDEKAAAKKKVNDAYDVAKQTAMEANSYEDLTTIKDEFLSNLPHKQGTPLKDQQSDAIAELEK^ 
QQEIEKAIEGDKTLPRDEKEKQIADSKERLKSDTQKVKDAKNADA1KI<AFEEGKVNIPQAHIPGDLNKDKEKLIJ^^ 
DKTLTEDEKKEQKVKTKAELEKAKTDVKNTQTREELDKKVPELKKAIEDTHVKGNLEGVKNKAIEDLKKAHTEWAKINGDDT 
EAQVKEADKALA^GKDAIT!<ADDADKVSTAWEHTPKIKAAHKTGDLKKAQVDANTALDKA^ 

TALTKAKDNVKAAKTADA(NDARDKGVAT1DAVHKAGQDLGARKSGQVAKLEEAAI<ATKDKISADPTLTSKEKEEQSKAVDAELKK^ 

AVNAADTADKVDDALGEGWDIKNQHKSGDSIDARREAHGKELDRVAQETKGAIEKDPTLTTEEKAKQVKDVDAAKERGMA 

DADALDKAYGEGWDIKNQHKSGDPVDARRGLHNKS1DEVAQATKDA1TADTTLTEAEKETQRGNVDKEATKAKEELA1<AKDADALD 

KAYGDGVTSIKNQHKSGKGLDVRKDEHKKALEAVAKR\n"AEIEADPTLTPEVREQQl<.^EVQKELELATDKIAEAKDADEADl<^^ 

VTAlENAHVIGKGlEARKDLAKKDb^EAAAKTKALllEDKTLTDDQRKEQLLGVDTEYAKGIENIDAAKDAAGVDI<AYSDGVRDIL^ 

GQNLNDRRNAAKEFLLKEADKVTKLINDDPTLTHDQKVDQINKVEQAKLDAIKSVDDAQTADAINDALGKGIENINNQYQHGDGVDVR 

KATAKGDLEKEAAKVKAL1AKDPTLTQADKDKQTAAVDAAKNTAIAAVDKATTTEGINQELGKGITA1NI<^^ 

EKEAAKVKALITNDPTLTKADKAKQTEAVAKALKAAIAAVDKATTAEGINQELGKGITAINKAYRPGE 

EAiANDPTLTKADKAKQTEAVAKALKAAlAAVDKATTAEGINQELGKGITAINI<AYRPGEGVEAHKEAA^^ 

SETEKAVQKQAVEQALAKALGQVEAAKWEAVKLAENLGTVAIRSAYVAGUitKDTDQATAALNEAKCW\lEALKC^^ 

LTEAQKAEQSENVSlJ^LKTAIATVRSAQSIASVKEAKDKGlTAIRAAYVPNKAVAKSSSANHLPKSGDANSIVLVGLGVMSLLLGM^^ 

KKKESKD 

SPy0747 
Seq ID 187 

MINKKCIIPVSLLTLAITLTSVEEWSRQNLTYANEIWQRPKRESVISDKSNFPVISPYLASVDFGERKTPLPTPDKGVKVTTEQSIAQVR 

KGPEERPYTVTGKITSVINGWGGYGFYIQDSEGIGLYWPQKDLGYSKGDIVQLTGTLTRFKGDLQLQQVTAHKKLELSFPTSVKEAVI 

SELETTTPSTLVKLSHWVGELSTDQYNNTSFLVRDDSGKSIWHIDHRTGVKGADVVTKISQGDLINLTAILSIVDGQLQLRPFSLEQLE 

WKKVTSSNSDASSRNIVKIGEIQGASHTSPLLKKAVWEQNAA/TYLDDSTHFYVQDLNGDGDLATSDGIRVFAKNAKVQVGDVLTIS 

EVEEFFGRGYEERKQTDLTITQIVAKAVTKTGTAQVPSPLVLGKDRIAPANIIDNDGLRVFDPEEDAIDYWESMEGMLVAVDDAKILGP 

MKNKEIYVLPGSSTRPLNNSGGVLLPANSYNTDVIPVLFKKGKQIIKAGDSYKGRLAGPVSYSYGNYKVFVDDSKNMPSLMDGHLKPE 

KTNLQKDLSKLSIASYNIENFSANPSSTKDEKVKRlAESFIHDLNAPDIiGLIEVQDNNGPTDDGTTDATQSAQRLIDAIKKLGGPTYRYV 

DIAPENNVDGGQPGGNIRTGFLYQPERVSLSDKPKGGARDALTWVNGELNLSVGRIDPTNAAWKDVRKSLJ^AEF!FQGRKVVW 

HLNSKRGDNALYGCVQPWFKSEQRRHVLANMU^QFAKEGAKHQANIVMLGDFNDFEFTKTIQLIEEGDMVNLVSRHDISDRYSYFH 

QGNNQTLDNILVSRHLLDHYEFDMVHVNSPFMEAHGRASDHDPLLLQLSFSKENDKAESSKQSVKAKKTSKGKLLPKTGDSL^ 

LGTASLLVPILLLTKGKKES 

SPy0777 
Seq ID 188 

VISFAPFLSPEAIKHLQENERCRDQSQKRTAQQIEAIYTSGQNILVSASAGSGKTFVMVERILDKILRGVSIDRLFISTFTVKAATELRERI 

ENKLYSQlAQTTDFQMKWLTEQLQSLCQADIGTMDAFAQKWSRYGYSiGISSQFRIMQDKAEQDVLKQEVFSKLFNEFMNQKEAPV 

FRALVKNFSGNCKDTSAFRELWTCYSFSQSTENPKIWLQENFLSAAKTYQRLEDIPDHDIELLLLAMQDTANQLRDVTDMEDYGQLT 

KAGSRSAKYTKHLTMEKLSDWVRDFKCLYGKAGLDRLIRDWGLIPSGNDVTVSKVKYPVFKTLHQKLKQFRHLETILMYQKDGFSLLE 

QLQDFVLAFSEAYLAVKIQESAFEFSDIAHFAIKILEENTDIRQSYQQHYHEVMVDEYQDNNHMQERLLTLLSNGHNRFMVGDIKQSIY 

RFRQADPQIFNQKFRDYQKKPEQGKVILLKENFRSQSEVLNVSNAVFSHLMDESVGDVLYDEQHQLIAGSHAQTVPYLDRRAQLLLY 

NSDKDDGNAPSDSEGISFSEVTIVAKEIIKLHNDKGVPFEDITLLVSSRTRNDIISHTFNQYGIPIATDGGQQNYLKSVEVMVMLDTLRTI 

NNPRNDYALVALLRSPMFAFDEDDLARIALQKDNELDKDCLYDKIQRAVIGRGAHPELIHDTLLGKLNVFLKTLKSWRRYAKLGSLYDL 

IWKIFNDRFYFDFVASCMKAEQAQANLYAL7\.LRANQFEKSGYKGLYRFIKMIDKVLETQNDIJ\DVEVATPKQAVNLMTIHKS 

VFILNCDKRFSMTDIHKSF1LNRQHGIGIKYLAD!KGLLGETTLNSVKVSMETLPYQLNKQELRU\TLSEEMRLLYVAMTRAEKKW 

ASKSKSQEITDPKKLGKLLPLALREQLLTFQDWLLAIADIFSTEDLYFDVRFIEDSDLTQESVGRLQTPQLLNPDDLKDNRQSETIARAL 

DMLEAVSQLNANYEAAIHLPTVRTPSQLKATYEPLLEPIGVDIIEKSSRSLSDFTLPHFSKKAKVEASHIGSALHQLMQVLPLSKPINQQ 

TLLDALRGIDSNEEVKTALDLKKIESFFCDTSLGQFFQTYQKHLYREAPFAILKLDPISQEEYVLRGIIDAYFLFDDHIVLVDYKTDKYKQP 

lELKKRYQQQLELYAEALTQTYKLPVTKRYLVLMGGGKPEIVEV 

SPy0789 
Seq ID 189 

MVKTDFKLRYQGSAlGYLWSILKPLMMFTIMYLVFlRFLRLGGNVPHFPVALLlJ^NViWSFFSEATSMGMVSIVSRGDLLRKLNFSKHII 

VFSAVLGALINFLINLNAA/LIFALINGWISGYAYLSLFLFIELWLVLGIALLLSNVFNA^RDLAQVWEVLLQAG 

AAKLLMLNPVAQMlQDFRYLLIDRANVriWQMSTNWFYIVIPYLVPFVlLFIGIFVFKKNADRFAEII 

SPy0839 
Seq ID 190 

MTFLSDLlSLMTKIRLSVVViKAGIFQLLF\nriANIVLSEFFYFILDWGQYHLDKDNVWFLKNPIALALLGAYL 

QEISFYLFRKQFSYYLRGLWKTFSGYQLLLFLLYILLTIPVLHIGLSSViTQKLYLPEFIVGELSKITSTKYLLYGSLILVFYLNLRLVYFLPLI 
AINHRTVAOfl^WRESWQKTKKKHVLLWMKLFAINGLTIWLSU^lSMILlFVDMFNPKGNNIIVQLGALTFTO^ 

AIEPQKQYDEPRRSNKAYWIFIVVTVGFAYQSLERLTFFDTSHSKTVIAHRGLVSAGVENSLEALEGAKKAGSDYVELDLILTKDNHFV 
VSHDNRLKRLAGVNKTIRNLTLKEVEHLTSHQGHFSGRFVSFDTFYQKAKKLNMPLLIELKPIGTEPGNYVDLFLETYHRLGISKDNKV 
MSLDLEVIEA!KKKNPSITTGY!IPIQFGFFGDEFVDFYVIEDFSYRSYLSSQAFWNNKEIY\AAn*lNDPKRlEHYLLKPIQGIlTDQP 
LIKDLKQDNSYFSRLVRIISSLY 

SPy0843 
Seq ID 191

MKKHLKTVALTLTTVSXAA-HNQEVFSLVKEPlLKQTCy^SSSISGADYAESSGKSKLKlNETSGPVDDTXn-DLFSDKRTTPEKIKDNL^^ 

PREQELKA\n'ENTESEKQITSGSQLEQSKESLSLNKTVPSTSNWEiCDFlTKGNTLVGLSKSGVEKLSQTDHLVLPSQAADGTQLlQVA 

SFAFTPDKKTAIAEYTSRAGENGEISQLDVDGKEIINEGEVFNSYLLKKVTIPTGYKHIGQDAFVDNKNIAEVNLPESLETISDYAFAHLA 

LKQIDLPDNLKAIGEIJ\FFDNQ1TGKLSLPRQLMRLAERAFKSNHIKTIEFRGNSLKV1GEASFQDNDLSQLMLPDGLEKIESEAFTGNP 

GDDHYNNRWLWTKSGKNPSGLATENTYVNPDKSLWQESPEIDYTKWLEEDFTYQKNSXn-GFSNKGLQKVKRNKNLEIPKQHNGVT 

iTEIGDNAFRNVDFQNKTLRKYDLEEVKLPSTIRKlGAFAFQSNNLKSFEASDDLEEIKEGAFMNNRIETLELKDKLVriGDAAFHiNHIYA 

IVLPESVQEIGRSAFRQNGANNLIFMGSKVKTLGEMAFLSNRLEHLDLSEQKQLTEIPVQAFSDNALKEVLLPASLKTIREEAFKKNHLK 

QLEVASALSHIAFNALDDNDGDEQFDNKXAA/KTHHNSYALADGEHFIVDPDKLSSTIVDLEKILKLIEGLDYSTLRQTTQTQFRDMTTA 

G1<ALLSKSNLRQGEKQKFLQEAQFFLGRVDLDKAIAKAEKALVTKKATKNGQLLERSINKAVLAYNNSAIKKAN 

EGKGPLAQATMVQGWLLKTPLPLPEYYIGLNWFDKSGKLIYALDMSDTIGEGQKDAYGNPILNVDEDNEGYHAL^VATI-ADYEGLDI 

KT1LNSKLSQLTS1RQVPTAAYHRAG1FQA1QN/W\EAEQLLPKPGTHSEKSSSSESANSKDRGLQSNPKTNRGRHSAILPRT^ 

VYGILGYTSVALLSLITAIKKKKY 

SPy0872 
Seq ID 192

MKKYFILKSSVLSILTSFTLLWDVQADQVDVQFLGVNDFHGALDNTGTAYTPSGKIPNAGTAAQLGAYMDDAEIDFKQANQDGTSIRV 
QAGDMVGASPANSALLQDEPTVKVFNKMKFEYGTLGNHEFDEGLDEFNRIMTGQAPDPESTINDITKQYEHEASHQTIVIANVIDKKT 

KDIPYGWKPYAIKDIAINDKIVKIGFIGWTTEIPNLVLKQNYEHYQFLDVAETIAKYAKELQEQHVHAIWLAHVPATSKDGWDHEMAT 

VMEKVNQIYPEHSIDIIFAGHNHQYTNGTIGKTRIVQALSQGKAYADVRGTLDTDTNDFIKTPSANWAVAPGIKTENSDIKAIINHANDIV 

KTX/TERKIGTATNSSTISKTENIDKESPVGNU^TTAQLTIAKKTFPTVDFAMTNNGGIRSDLWKNDRTITWGAAQAVQPFGNILQV^ 

TGQHIYDVLNQQYDENQTYFLQMSGLTYTYTDNDPKNSDTPFKIVKWKDNGEEINLTTTYTNAA/NDFLYGGGDGFSAFKKAKLIGA^ 

TDTEAFITYlTNLEASGKTVNATIKGVKNY\n-SNLESSTKVNSAGKHSlISKVFRNRDGNTVSSEVISDLLTSTENTNNSLGKKETTTNKN 

TISSSTLPITGDNYKMSPIMTILALISLGGLNAFIKKRKS 

SPy0895 
Seq ID 193 

MTNNQTLDILLDWAYNHAFRIAKALPNIPKTALYLLEMLKERRELNLAFLAEHAAENRTIEDQYHCSLWLNQSLEDEQIANYILDLEVKV 

KNGAIIDFVRSVSPILYRLFLRLITSEIPNFKAYIFDTKNDQYDTWHFQAMLESDHEVFKAYLSQKQSRNVTTKSLADMLTLTSLPQEI 

LVFLLRHFEKAVRNPLAHLIKPFDEEEUHRTTHFSSQAFLENIITLATFSGVIYRREPFYFDDMNAIIKKELSLWRQSIV 

SPy0972 
Seq ID 194 

MKTTSUKVDLPSTIGIGYGAFWRSRNFYRWKGSRGSKKSKTTALNFIVRLLKYPWANLLVIRRYSNTNKQSTYTDFKWACNQLKW 
HLFKFNESLPEITVKATGQKILFRGLDDELKITSITVDVGALCWAWFEEAYQIETEDKFSTWESIRGSLDAPDFFKQITVTFNPWSERH 
WLKRVFFDEETKRADTFSGTTTFRVNEWLDDVDKRRYEDLYKTNPRRARIVCDGEWGVAEGLVFDNFEWDFDVEKTIQRVKETSA 
GMDFGFTQDPTTLICVAVDLANKELWLYNEHYQKAMLTDHIVKMIRDKNLHRSYIAGDSAEKRLIAEIKSKGVSGIVPSIKGKGSIMQGI 
QFMQGFKIYIHPSCEHTIEEFNTYTFKQDKEGNWLNEPIDKNNHVIDAIRYALEKYHIRSNESNQFEVLRAGFGY 

SPy0981 
Seq ID 195 

MAEETQTVETVEEQWPEAKQPQDEKKYTDADVDAIIDKKFAKWKSEQEAEKSEAKKMAKMNEKEKADYEKQKLLDELQELKNDK^ 

RNELTAVARQMFAESEINVNDDVLGLNA/TLDAEQTKAN\n'TlJ\NAFAKVIADDRKALVRQTTPSTGGGLSKQTNYGANLASI^ 

TKLF 

SPy1008 
Seq ID 196 

MRYNCRYSHlDKKIYSMIICLSFLLYSNWQANSYNTTNRHNLESLYKHDSNLlEADSIKNSPDI\n*SHMLKYSVKDKNLSVFFEKDW 

QEFKDKEVDIYALSAQEVCECPGKRYEAFGGITLTNSEKKEIKVPVNXAA/DKSKQQPPMFIWNKPKVTAQEVDIKVRKLLIKKYDIYNN 

REQKYSKGTVTLDLNSGKDIVFDLYYFGNGDFNSMLKIYSNNERIDSTQFHVDVSIS 

SPy1032 
Seq ID 197 

VNTYFCTHHKQLLLYSNLFLSFAMMGQGTAIYADTLTSNSEPNNTYFQTQTLTTTDSEKKWQPQQKDYYTELLDQWNSIIAGNDAYD 

KTNPDM\n"FHNKAEKDAQNIIKSYQGPDHENRTYLWEHAKDYSASANlTKTYRNIEKIAKQlTNPESCYYQDSKAlAIVKDGMAFMYEH 

AYNLDRENHQTTGKENKENWWVYEIGTPRAlNNTLSLMYPYFTQEEILKYTAPIEKFVPDPTRFRVRAANFSPFEANSGNLiDMGRVK 

LlSGILRKDDLEISDTIKAlEKVFTLVDEGNGFYQDGSLIDHXArrNAQSPLYKKGIAYTGAYGNVLlDGLSQLlPllQKTKSPIKADKMAT 

WINHSFFPIIVRGEMMDMTRGRSISRFNAQSHVAG1EALRA1LR1ADMSEEPHRLALKTRU<TLWQGNAFYNWDNLKTYHDIKLM 

LSDTSVPVQKLDSYVASFNSMDKLALYNNKHDFAFGLSMFSNRTQNYEAMNNENLHGWFTSDGMFYLYNNDLGHYSENYWATVNP 

YRLPGTTETEQKPLEGTPENIKTNYQQVGMTGLSDDAFVASKKLNNTSALAAMTFTNWNKSLTLNKGWFILGNKIIFVGSNIKNQSSH 

KAYTTIEQRKENQKYPYGSYVNNQPVDLNNQLVDFTNTKSIFLESDDPAQNIGYYFFKPTTLSISKALQTGKWQNIKADDKSPEAIKEV 

SNTFITIMQNHTQDGDRYAYMMLPNMTRQEFETYISKLDIDLLENNDKUV\WDHDSQQMHVIHYGKKATMFSNHNLSHQGFYSFPH 

PVRQNQQ 

SPy1054 
Seq ID 198 

LLTFGGASAVKAEENEKVREQEKLIQQLSEKLVEINDLQTLNGDKESIQSLVDYLTRRGKLEEEWMEYLNSGIQRKLFVGPKGPAGEK 
GEQGPTGKQGERGETGPAGPRGDKGETGDKGAQGPVGPAGKDGQNGKDGLPGKDGKDGQNGKDGLPGKDGKDGQDGKDGLP 
GKDGKDGQNGKDGLPGKDGQPGKPAPKTPEWQNPDTAPHTPKTPRlPGQSKDVTPAPQNPSNRGLNKPQTQGGNQLAKTP/SiAH 
DTHRQLPATGETTNPFFTA/y^VAIMTTAGWAVAKRQENN 

SPy1063 
Seq ID 199 

MYIFSSSKKDSAKELVILTPNSQTILTGTIPAFEEKYGVKVRLIQGGTGQUDQLGRKDKPLNADIFFGGNYTQFESHKDLFESYVSPQV 
STVISDYQLPSHRATPYTlNGSVLIVNNELJ^RGLHITSYEDLLQPALKGKIAFADPNSSSSAFSQLTNlLlJ^KGGYTNADAWAYMKRLL^ 
NMNSIRATSSSEWQSVAEGKMIVGLTYEDPCINLQKSGANVSIWPKEGTVFVPSSVAIIKHAPNMTEAKLFINFMLSRDVQNAFGQS 

TSNRPIRQDAQTSHDMKALETIATLKEDYAYVTKHKKKIVATYNQLRQRLEKAK 

SPy1162 
Seq ID 200 

MPTSIKAIKESLEAVTSLLDPLFQELATDTRSGVQKALKSRQKVIQAELAEEERLEAMLSYEKALYKKGYKAIAGIDEVGRGPL^^ 

ACVlLPKYCKlKGLNDSKKIPKAKHETIYQAVKEKALAIGlGIIDNQLIDEVNiYEATKU^MLEAlKQLEGQLTQPDYLLIDA 

LKGDANSLSI/KAASIVAKVTRDQMMANYDRIFPGYDFAKNAGYGTKEHLQGLKAYGITPIHRKSFEPVKSMCCDSTNP 

SPy1206 
Seq ID 201 

MWKEETMSILEVKQLSHGFGDRAIFENVSFRLLKGEHIGLVGANGEGKSTFMSIVTGHLQPDEGKVEWSKYVTAGYLDQHTVLESG 

QWRDVLRTAFDELFKTENRlNElYASMADDKADIAVLMEEVGELQDRLESRDFYTLDAKIDEVARALGVMDFGMESDVrSLSGGQRT 

KVLLAKLLLEKPDILLLDEPTNHLDAEHIEWLKRYLQHYENAFVLISHDISFLNDVINIWHVENQSLVRYTGDYYQFQAWEMK^^ 

AAYERQQKEIANLQDFVNRNKARVATRNMAMSRQKKLDKMDIIELOO^EKPKPNFEFKQARTPSRFIFQTKNLVIGYDYPLTKEPLN^ 

ERNQKlAlVGANGIGKSTLLKSLLGVIEPLEGHINn-GDFLEVGYFEQEWGVNRQTPLENAWDAFPALNQAEVRAALj^R^ 

QIQVLSGGEQAKVRFCLLMNRENNVLILDEPTNHLDIDAKNELKRALKAYKGSILMVCHEPDFYNGWVTDTWDFSKLT 

SPy1228 

Seq ID 202 

MNKKFIGLGLASVAVLSIJ^CGNRGASKGGASGKTDLKVAMWDTGGVDDKSFNQSAWEGLQSWGKEMGLQKGTGFDYFQSTS 
SEYATNLDTAVSGGYQLIYGIGFALKDAIAKAAGDNEGVKFVIIDDIIEGKDNVASWFADHEAAYLAGIAAAKTTKTKTVGFVGGMEGT 
VITRFEKGFEAGVKSVDDTIQVKVDYAGSFGDAAKGKTIAAAQYAAGADVIYQAAGGTGAGVFNEAKAINEKRSEADKVWVIGVDRD 
QKDEGKYTSKDGKEANFVIJ\SSIKEVG1<AVQLINKQVADKKFPGGKTTWGLKDGGVE1ATTNVSKEAVKAIKEAKAK1KSG 

SPy1245 
Seq ID 203 

MKMKKKFFLLSLLALSTFFLSACSSWIDKGESITAVGSTALQPLVEAVADEFGSSNLGKTVNVQGGGSGTGLSQVQSGAVQIGNSDV 
FAEEKDGIDASKLVDHQVAVAGLAVIANPKVKVSNLSSQQLQKIFSGEYTNWKQVGGEDLAISVINRAASSGSRATFDSVIMKGVNAK 
QSQEQDSNGMVKSIVSQTPGAISYLSFAYVDSSVKSLQLNGFKANAKNVATNDWPIWSYEHMYTKDKPTGLTKEFLDYMFSDEVQQ 
NIVTHMGYISINDMEWKSHDGKVTKR 

SPy1315 
Seq ID 204 

MTHKIKVLLLAIMSIFLTCNIASAETIAIVSDTAYAPFEFKDSDQIYKGIDVDIINEVAKRQSWDFSMSFPGFDAAVNAVQSGQASALMAG 

TTITNARKKVFHFSEPYYDTKlVIATRKANAIKKYSDLKGKTVGVKNGTAAQAFLNNYKKKYDYTVKTFDTGDLMYNSLSAGSi/\AVMD 

DEAVIQYAISQNQDIAINMKGEPIGSFGFAVKKGSGYDYLVNDFNTALKAMKADGTYQAIMTKWLGTDDKATTSQATGNPSAKATPT^ 

DSYKIVSDSSFAPFEFQNGKGKYVGIDIELIKAIAKQQGFKIEIANPGFDAALNAVQSSQADGVIAGATITDARKAIFDFSDPYYTSNIILA 

VKAGKNIKNYEDLDRKTVGAKNGTSSYSWLKENAPKYGYNVKAFDDGSSMYDSLNSGSVDAIMDDEAVLKYAISQGRRFETPLEGIS 

TGEVGFAVKKGTNPELIEMFNNGLAALKKSGQYDDIIDKYLDSKKAATPSEKGADESTISGLLSNNYKQLUVGLGTTLSLTLI^ 

IFGMMAVSPTKSLRLISWFVDWRGIPLMIVAAFIFWGVPNLIESMTGHQSPINDFUSATIALSLNGGAY 

RSLGLSYGTTMRKVILPQAVKLMLPNFINQFVISLKDTTIVSAIGLVELFQTGKlllARNYQSFRMYAIUSiHYUMULLTRL^^ 

SPy1357 
Seq ID 205 

MGKEIKVKCFLRRSAFGLVAVSASXA-VGSWSAVDSPIEQPRIIPNGGTLTNLLGNAPEKLALRNEERAIDELKKQAIEDKEATTAI^ 

SDALEAU^DQTDALQSEEAAWKADN/\ASDALEALADQTDALQSEEAEWQSDN/\ASDAWEK/^ATPIALD^^ 

NVNTLPTTGEESNPFFT/WSlLAIMVSTGVLWSSKCKEN 

SPy1361 
Seq ID 206 

MKTKKVIILVGLLLSSQLTLIACQSRGNGTYPIKTKQSRKGMTSNKIKPIKKSKKTNKTHKGVAGVDFPTDDGFILTKDSKILSKTDQGIV 

VDHDGHSHFIFYADLKGSPFEYLIPKGASLAKPAVAQRAASQGTSKVADPHHHYEFNPADIVAEDALGYTVRHDDHFHYILKSSLSGQ 

TQAQAKQVATRLPQTSSLVSTATANGIPGLHFPTSDGFQFNGQGIVGVTKDSILVDHDGHLHPISFADLRQGGWAHVADQYDPAKKA 

EKPAETHQTPELSEREKEYQEKLAYLAEKLGIDPSTIKRVETQDGKLGLEYPHHDHAHVLMLSDIEIGKDIPDPHAIEHARELEKHKVG 

MDTLRALGFDEEVILDIVRTHDAPTPFPSNEKDPNMMKEWLATVIKLDLGSRKDPLQRKGLSLLPNLETLGIGFTPIKDISPVLQFKKLK 

QLLMTKTGVTDYRFLDNMPQLEGIDISQNNLKDISFLSKYKNLTLVAAADNGIEDIRPLGQLPNLKFLVLSNNKISDLSPLASLHQLQELH 

IDNNQITDLSPVSHKESLTWDLSRNADVDLATLQAPKLETLMVNDTKVSHLDFLKNNPNLSSLSINRAQLQSLEGIEASSVIVRVEAEG 

NQ1KSLVLKDKQGSLTFLDWGNQLTSLEGVNNFTALDILSVSKNQLTNVNLSKPNKTVTNID1SHNNISU\DLKLNEQHIPEAIAKNFPA 

WEGSMVGNGTAEEKAAMATKAKESAQEASESHDYNHNHTYEDEEGHAHEHRDKDDHDHEHEDENEAKDEQNHAD 

SPy1371 
Seq ID 207 

LAKQYKNLVNGEWKLSENEITIYAPATGEELGSVPAMTQAEVDAWASAKI<ALSDWRALSYVERAAYLHKAADILVRDAEKIGAILSKE 

VAKGHKAAVSEVIRTAEIINYAAEEGLRMEGEVLEGGSFEAASKKKIAIVRREPVGLVLAISPFNYPVNLAGSKIAPALIAGNWAL^^ 

QGSISGLLUKEAFAEAGIPAGVFNTITGRGSVIGDYIVEHEAVSFINFTGSTPIGEG1GKIJ\GMRPIMLELGGKDSAIVLEDADU\ 

VAGAFGYSGQRCTAVKR\A.VMDKVADQL«VAEIKTLVEKLSVGMPEDDAD1TPLIDTSAADFVEGLIKDA 

VLFDH\nTDMRUKWEEPFGPVLPIIR\m'VEEAIKISNESEYGLQASlFTTNFPKAFGIAEQLEVGTVHLNNKTQRGT 

AGVQGVKYSIEAMTTVKSWFDIQ 

SPy1375 
Seq ID 208 

MSLKDLGDISYFRLNNEINRPVNGKIPLHKDKEALKAFSAENVLPNTMSFTSITEKIEYUSNDYIESAFIQKYRPEFITELDSIIKSENFRF 

KSFMAAYKFYQQYALKTNDGEHYLENLEDRVLFNALYFADGQEDLAKDLAVEMINQRYQPATPSFLNAGRSRRGELVSCFLIQVTDD 

MNSIGRSINSALQLSRIGGGVGITLSNLREAGAPIKGYAGAASGWPVMKLFEDSFSYSNQLGQRQGAGWYLNVFHPDIIAFLSTKKE 

NADEKVRVKTLSLGITVPDKFYEU^RKNEDMYLFSPYNVEKEYGIPFNYLDITNMYDELVANPKITKTKIKARDLETEISKLQQESGYPYI 

INIDTANKANPIDGKIIMSNLCSEILQVQTPSLINDAQEFVEMGTDISCNLGSTNILNMMTSPDFGRSIKTMTRALTFVTDSSSIEAVPTIK 
HGNSQAHTFGLGAMGLHSYLAQHHIEYGSPESIEFTDIYFMLLNYVVTLVESNNIARERQTTFVGFENSKYANGSYFDKYVTGHFV^

DLVKDLFKDHFIPQASDWEALRDAVQKDGLYHQNRLAVAPNGSISYjNDCSASlHPITQRIEERQEKKIGKIYYPANGLSTDTlPYYTSA 

YDMDMRKVIDWAAATEHVDQGLSLTLFLRSELPMELYEWKTQSKQTTRDLSILRNYAFNKGIKSIYYIRTFTDDGEEVGANQCESCVI 

SPy1389 
Seq ID 209 

MKELSSAQIRQMWLDR/VKSKGHCVEPSANLVPVNDPTLLWINSGVATLKKYFDGSVIPENPRITNAQKSIRTNDIENVGKTARHHTMF 

EMLGNFSIGDYFRDEAIEWGFELLTSPDWFDFPKDKLYMTYYPDDKDSYNRWIACGVEPSHLVPIEDNFWEIGAGPSGPDTEIFFDR 

GEDFDPENIGLRLLj^EDIENDRYlEIWNlVLSQFNADPAVPRSEYKELPNKNIDTGAGLERLAAVMQGAKTNFETDLFMPIIREVEKLSG 

KTYDPDGDNMSFKVIADHIRALSFAIGDGALPGNEGRGYVLRRLLRRAVMHGRRLGINETFLYKLVPWGQIMESYYPEVLEKRDFIEK 

IVKREEETFARTIDAGSGHLDSLLAQLKAEGKDTLEGKDIFKLYDTYGFPVELTEELAEDAGYKIDHEGFKSAMKEQQDRARAAWKG 

GSMGMQNETLAGIVEESRFEYDTYSLESSLSVIiADNERTEAVSEGO^^LLVFAQTPFYAEMGGQVADTGRlKNDKGDTVAEWDVQK 

APNGQPLHTVNVLASLSVGTNYTLEINKERRLAVEKNHTATHLLHAALHNVIGEHATQAGSLNEEEFLRFDFTHFEAVSNEELRHIEQE 

VNEQIWNALTITTTETDVETAKEMGAMALFGEKYGKWRWQIGNYSVELCGGTHLNNSSEIGLFKIVKEEGIGSGTRRIIAVTGRQAF 

EAYRNQEDALKEIAAWKAPQLKDAAAKVQALSDSLRDLQKENAELKEKAAAAAAGDVFKDVQEAKGVRFIASQVDVADAGALRTFA 

DNWKQKDYSDVLVLVAAIGEKVNVLVASKTKDVHAGNMIKElJ^PIVAGRGGGKPDMAMAGGSDASKIAELUV^V^ 

SPy1390 
Seq ID 210 

MKNSNKLIASNAATLASVMALAACQSTNDNTKVISMKGDTISVSDFYNETKNTEVSQKAMLNLVISRVFEAQYGDK^^ 
AEQYGASFSAALAQSSLTPETFKRQIRSSKLVEYAVKEAAKKELTTQEYKKAYESYTPTIV!AVEMITLDNEETAKSVLEEL1<AEGADFTA 
lAKEKTTTPEKKVTTYKFDSGATNVPTDWKAASSLNEGGlSDVISVLDPTSYQKKFYiVKVTKKAEKKSDWQEYKKR 
NFQNKV1ANALDKANVKIKD1<AFAN1LAQYANLGQKTKAASESSTTSESSKAAEENPSESEQTQTS 

SPy1422 
Seq ID 211 

VLYPTPIAKLIDSYSKLPGIGIKTATRLAFYTIGMSNEDVNDFAKNLLAAKRELTYCSICGNLTDDDPCHICTDTSRDQTTILW 

SAMEKlQEYHGYYHVLHGLISPMNGVGPDDINLKSLITRLMDGKVSEVIVATNATADGEATSMYlSRVLKPAGIK\rrRl^RGL^^ 
YADEVTLLRAIENRTEL 

SPy1436 
Seq ID 212 

MDMSKSNRRTWQGLWILIAILTTFITSTVTAARKIRNFPDTTEILLGTKATETPGILPFTGSYQLVLGDLDNLQRPTFAHIQLKD 
IKRKGLKFNPPGWHNYKLTDANGKTWLMDRGHLVGYQFSGLNDEPKNLVTMTKYLNTGFSDKNPLGMLYYENRLDSWLALHPNF 

WLDYKVTPWHKNELVPRQWLQYVGIDENGDLLQIKLGSEKESVDNFGVTSVTLDNVSPLAELDYQTGMMLDSTQNEEDSNLETEE 

FEEAA 

SPy1494 
Seq ID 213 

MTSKKACLSSIIVLASLTCGNDTVSANHLSATGDKFDDCSTLVEKDVAPKDELEMLAWSSSQTTDDADRDYEDFLDDDSFISQNETDK 
MFENLTDDRLLNELDELDEENEEDEEDTIEPEQNVIMPSDDELFDLTDAVETRLTVSSAPHLEAELPKPHLRSLSDTALRSGEIRGHLD 
NKLDALSWATKLALTMAQKFDLTTHVYSIGESFSEVLAAHYEDRKAESAFSKKKRFHLPIATPDWIEELRRLVSSIGSSKEDVSVPYS 
RKLGMAVAKRKIALPQTGERFSYYPVLLGLMILGLTPIMIPKKINN 

SPy1523 
Seq ID 214 

MAKDKEKQSDDKLVLTEWQKRNIEFLKKKKQQAEEEKKLKEKLLSDKKAQQQAQNASEAVELKTDEKTDSQEIESETTSKPKKTKKV 

RQPKEKSATQIAFQKSLPVLLGALLLMAVSIFMITPYSKKKEFSVRGNHQTNLDELIKASKVKASDYWLTLLTSPGQYERPILRTIPWVK 

SVHLSYQFPNHFLFNVIEFEIIAYAQVENGFQPILENGKRVDKVRASELPKSFLILNLKDEKAIQQLVKQLTTLPKKLVKNIKSVSLANSKT 

TADLLLIEMHDGNWRVPQSQLTLKLPYYQKLKKNLENDSIVDMEVGIYTTTQEIENQPEVPLTPEQNAADKEGDKPGEHQEQTDNDS 
ETPANQSSPQQTPPSPETVLEQAHG 

SPy1536 
Seq ID 215 

MKRLKKIKVWVLVGLLALISLLLALFFPLPYYIEMPGGAYDIRTVLQVNGKEDKRKGAYQFVAVGISRASLAQLLYAWLTPFTEISTAEDT 
TGGYSDADFLRlNQFYMETSQN/Su^lYQALSLAGKPVTLDYKGWVLDVNNESTFKGTLHLADTVTGVNGKQFTSSAELIDYVSHLKLG 
DEVTVQFTSDNKPKKGVGRIIKLKNGKNGIGIALTDHTSVNSEDTVIFSTKGVGGPSAGLMFTLDIYDQITKEDLRKGRTIAGTGTIGKD 
GEVGDIGGAGLKWAAAEAGADIFFVPNNPVDKEIKKVNPNAISNYEEAKRAAKRLKTKMKIVPVTTVQEALWLRK 

SPy1664 
Seq ID 216 

MLEHKIDFMVTLEVKEANANGDPLNGNMPRTDAKGYGVMSDVSIKRKIRNRLQDMGKSIFVQANERIEDDFRSLEKRFSQHFTAKTP 
DKEIEEKANALWFDVRAFGQVFTYLKKSIGVRGPVSISMAKSLEPIVISSLQITRSTNGMEAKNNSGRSSDTMGTKHFVDYGVYVLKG 

E1<L^KYHA^ 

SPy1604 
Seq ID 217 

MATKKVHIISHSHWDREWYMAYEQHHMRLINLIDDLLEVFQTDPDFHSFHLDGQTIILDDYLKVRPEREPEIRQAIASGKLRIGPFYILQ 

DDFLTSSESNVRNMLlGKEDCDRWGASVPLGYFPDTFGNMGQTPQLMLKAGLOMy'^FGRGIRPTGFNNQVDTSEKYSSQFSEISW 

QGPDNSRILGLLFANWYSNGNEIPTTEAEARLFVl/DKKLADAERFASTKHLLMMNGCDHQPVQLDVrKAIALANQLYPDYEFVHSCFE 

DYLADLADDLPENLSTVQGEITSQETDGWYTLANTASARIYLKQANTRVSRQLENITEPLAAMAYEVTSTYPHDQLRYAWKTLMQNH 

PHDSICGCSVDSVHREMMTRFEKAYEVGHYLAKEAAKQIADAIDTRDFPMDSQPFVLFNTSGHSKTSVAELSLTWKKYHFGQRFPKE 

VYQEAQEYLARLSQSFQIIDTSGQVRPEAEILGTSIAFDYDLPKRSFREPYFAIKVRLRLPjTLPAMSWKTLALKLGNETTPSETVSLYD 

DSNQGLENGFLKVMIQTDGRLTITDKQSGLIYQDLLRFEDCGDIGNEYISRQPNHDQPFYADQGTIKLNIiSNTAQVAELEIQQTFAIPIS 

ADKLLQAEMEAVIDITERQARRSQEKAELTLTTLIRMEKNNPRLQFTTRFDNQMTNHRLRVLFPTHLKTDHHLADSIFETVKRPNHPDA 
TFWKNPSNPQHQECFVSLFDGENGVTIGNYGLNEYEILPDTNTIAITLLRSVGEMGDWGYFPTPEAQCLGKHSLSYSFESITKQTQFA
STPSQLGM^^ 

SPy1607 
Seq ID 218 

MKITKIEKKKRLYLIELDNDESLYWEDTIVRFMLSKDKVLDNDQLEDMKHFAQLSYGKNLALYFLSFQQRSNKQVADYLRKHEIEEHIIA 
DIITQLQEEQWIDDTKLADTYIRQNQLNGDKGPQVLKQKLLQKGIASHDIDPILSQTDFSQLAQKVSQKLFDKYQEKLPPKALKDKITQA 
LLTKGFSYDLAKHSLNHLNFDQDNQEIEDLLDKELDKQYRKLSRKYDGYTLKQKLYQALYRKGYNSDDINCKLRNYL 

SPy1615 
Seq ID 219 

MICLLCQQISQTPISITEIIFLRRISSPICQQCQKSFQKIGKSVCATGCANSDIIACRDCLKWENKGYNVNHRSLYCYNAAMK^^ 

QGDYLLRKVFAVELADVITKYYKGYIPVPVPVSPGCFRERQFNQVSAILEAANVSYLSLFEKLDNTHQSSRTKKERLLVEKSYRLLKVS 
Nl PDKILIVDDIYTTGSTIIALRKQLAKVANSDIKSLSIAR 

SPy1666 
Seq ID 220 

MKSFSLTFSFLNLLKYGTIKVMTKEFHH\n-VLLHETVDMLDlKPDGIYVDATLGGSGHSAYLLSKLGEEGHLYCFDQDQI<AIDNAQ^^ 
KSYIDKGQVTFIKDNFRHLKARLTALGVDEIDGILYDLGVSSPQLDERERGFSYKQDAPLDMRMDRQSLLTAYEWNTYPFNDLVKI^ 
KYGEDKFSKQIARKIEQARAIKPIETTTELAELIKAAKPAKELKKKGHPAKQIFQAIRIEVNDELGAADESIQDAMELL^^ 
EDRLTKQLFKEASTVDVPKGLPLIPEDMKPKFELVSRKPILPSHSELTANKRAHSAKLRVAKKIRK 

SPy1727 
Seq ID 221 

yTTTEQELTLTPLRGKSGKAYKGTYPNGECVFIKLNTTPILPALAKEQIAPQLLWAKRMGNGD 

ILLRLHKSKKLVNQLLQLNYKIENPYDLLVDFEQNAPLQIQQNSYLO^IVKELKRSLPEFKSEVATIVHGDIKHSNWVITTSGMIFLVDWD 
SVRLTDRMYDVAYLLSHYIPRSRWSEWLSYYGYKNNDKVMQKIIWYGQFSHLTQILKCFDKRDMEHVNQEIYALRKFREIFRKK 

SPy1785 
Seq ID 222 

MILTAPMSNLKGFGPKSAEKFQKLDIYTVEDLLLYYPFRYEDFKSKSVFDLVDGEKAVITGLWTPANVQYYGFKRNRLSFKLRQGEAV 
LNVSFFNQPYLADKIELGQEVAVFGKWDATKSAITGMKVLAQVEDDMQPWRVAQGISQSTLIKAIKSAFEIDAHLELKENLPATLLEKY 

SGAHMNRLLQGDVGSGKWIAGLSMYAAYTAGFQSALMVPTEILAEQHYISLQELFPDLSIAILTSGMKAAVKRWl^ 
GTHALIQDSVQYHKLGLVITDEQHRFGVKQRRIFREKGENPDVLMMTATPIPRTLAITAFGEMDVSNDELPAGRKPIMTRWVKHEQLG 
TVLEWVKGELQKDAQVYVISPLIEESEALDLKNAVALHAELSTYFEGIAKVALVHGRMKNDEKDAIMQDFKDKKSHILVST^^ 
PNATIMIIMDADRFGLSQLHQLRGRVGRGYKQSYAVLVANPKTDSGKKRMTIMTETTDGFVU^ESDLKMRGSGEI^ 
ADIVEDYPILEEARKVSAAIVSDPNWIYEKQWQLVAQNIRKKEVYD r^oov:,I=irv:, ntuc.i:,irt:.-uv 

SPy1798 
Seq ID 223 

MKKISKCAFVAISALVLIQATQTVKSQEPLVQSQLVTTVALTQDNRLLVEEIGPYASQSAGKEYYKHIEKIIVDNDVYEKSLEGERTFDIN 

YQGIKINADLIKDGKHELTIVNKKDGDILITFIKKGDKXn-FISAQKLGTTDHQDSLKKDVLSDKWPQNQGTQKWKSGKNTANLSLlTKL 

SQEDGAILFPEIDRYSDNKQIKALTQQITK\m/NGTWKDLISDSVKDTNGWVSNMTGLHLGTKAFKDGENTlVISSKGFED\n-|TW 

DGQIHFVSAKQKQHNA-AEDRQSTKLDVTTLEKAIKEADAIIAKESNKDAVKDLAEKLQVIKDSYKEIKDSKLLADTHRLLKD^ 

VSINNLTEGTYTLNFKANKENSEESSMLQGAFDKRAKLWKADGTMEISMLNTALGQFLIDFSIESKGTYPAAVRKQVGQKDING^^ 

SEFTMPIDDLDKLHKGAVLVSAMGGQESDLNHYDKYTKLDMTFSKTVTKGWSGYQVETDDKEKGVGTERLEKVLVKLGKDLDGDGK 

LSKTELEQIRGELRLDHYELTDISLLKHAKNITELHLDGNQITEIPKELFSQMKQLRFLNLRSNHLTYLDKDTFKSNAQLRELYLSSNFIH 

SLEGGLFQSLHHLEQLDLSKNRIGRLGDNPFEGLSRLTSLGFAENSLEEIPEKALEPLTSLNFIDLSQNNLALLPKTIEKLRALSTIVASR 

NHITRIDNISFKNLPKLSVLDLSTNEISNLPNGIFKQNNQLTKLDFFNNLLTQVEESVFPDVETLNLDVKFNQIKSVSPKVRALIGQHKLTP 

QKHIAKLEASLDGEKIKYHa^FSLLDLYYWEQKTNSAlDKELVSVEEYQQLLQEKGSDTVSLLNDMQVDWSIVIQLQKKASNGQYVTV 

DEKLLSNDPKDDLTGEFSLKDPGTYRIRKALITKKFATQKEHIYLTSNDILVAKGPHSHQKDLVENGLRALNQKQLRDGIYYLNASMLKT 

DLASESMSNKAINHRVTLWKKGVSYLEVEFRGIKVGKMLGYLGELSYFVDGYQRDLAGKPVGRTKKAEWSYFTDVTGLPLADRY 

KNYPKVLRMKUEO'^KKDGLVPLQVFVPIMDAISKGSGLQTVFMRLDWASLTTEKAKWKETNNPQENSHLTSTDQLKG 

UAGLS^^ 

SPy1801 
Seq ID 224 

MNKNKLLRVAMLLSLLAPTAESMTVU\QDVMLETHKATTNETSDSSSKEENNKN/V\PTTS 

SDSSQSAIGSSDNKAEAENQVDDKSTDHSKSTDHSKPTDQPKPSPSKVDTAPASSLSKQLPEARTPiQSLSPYVSDLDLSEIDIPSVN 

TYAAYVEHWSGKNAYTHHLLSRRYGIKADQIDSYLKSTGIAYDSTRINGEKLLQWEKKSGLDVRAIVAIAMSESSLGTQGIATLLGANM 

FGYAAFDLDPTQASKFNDDSAIVKIVrrQDTIIKNKNSNFALQDLI<AAKFSRGQLNFASDGG\A^FrDTTGSGKRRAQIMEDLDI<^ 

GGTPAIPAELKVQSSASFASVPAGYKLSKSYDVLGYQASSYAWGQCTWYWNRAKELGYQFDPFMGNGGDWKYKVGYALSCT^ 

GYAISFAPGQAGADGTYGHVSIVEDVRKDGSILISESNCIGLGKISYRTFTAQQAEQLTYVIGKSKN 

SPy1813 
Seq ID 225 

MpKHLLVKRTLGCVCAATLMGAALATHHDSLNTVKAEEKTVQVQKGLPSIDSLHYLSENSKKEFKEELSKAGQESQKVKEILAKAQC^ 

DKQAQELAKMKIPEKIPMKPLHGSLYGGYFRmHDKTSDPTEKDKVNSMGELPKEVDL7>iFIFHDVVTKDYSLFWK^ 

GTRVIRTIPWRFLAGGDNSGIAEDTSKYPNTPEGNKALAKAIVDEYVYKYNLDGLDVDVEHDSIPKVDKKEDTAGVERSIQVFEEIGKLI 

GPKGVDKSRLFIMDSTYMADKNPLIERGAPYINLLLVQVYGSQGEKGGWEPVSNRPEKTMEERWQGYSKYIRPEQYMIGFSFYEEN 

AQEGNLWYDINSRKDEDKANGINTDITGTRAERYARWQPKTGGVKGGIFSYAIDRDGVAHQPKKYAKQKEFKDATDNIFHSDYSVSK 

ALKWMLKDKSYDLIDEKDFPDKALREAVMAQVGTRKGDLERFNGTLRLDNPAIQSLEGLNKFKKUKQLDLIGLSRITKLDRSVLPANM 

KPGKDTLEWLETYKKDNKEEPATIPPVSLKVSGLTGLKELDLSGFDRETIJ^GLDAATLTSLEKVDISGNKL^^^ 
ISNHVGSNEQTVKFDKQKPTGHYPDTYGHn-SLRLPVANEKVDLQSQLLFGTVTNQGTLiNSEADYl^^
FKVSYENYTVKV^DSTLGTTTDKTLA^^ 

GSETDNISLGWDSKQSIIFKLKEDGLIKHWRFFNDSARNPETTNKPIQEASLQIFNIKDYNLDNLLENPNKFDDEKYWITVDTYSAQGE 
FQ VTAI N AN AG VLKDCI EKRQLLKK 

SPy1821 
Seq ID 226 

DQYEIPN/ANVEQELLYILENSDVKIQFYGSEVIGXn-VPTTVELTVAETQPSIKGATWGSGK 
TYVSRA 

SPy1916 
Seq ID 227 

MTKTLPKDFIFGGATAAYQAEGATHTDGKGPVAWDKYLEDNYWYTAEPASDFYNRYPVDLKLSEEFGVNGIRISIAWSRIFPTGKGEV 

NPKGVEYYHNLFAEGHKRHVEPFWLHHFDTPEALHSDGDFLNRENIEHFVNYAEFCFKEFSEVNYVVTTFNEIGPIGDGQ^^ 

GIQYDUKKVFQSHHNMMVSHARAVKLFKDSGYSGEIGWHALPTKYPFDANNPDDNmAAELEDIIHNKFILDATYLGKYSDKTMEGVN 

HILEVNGGELDLREEDFAALDAAKDLNDFLGINYYMSDWMQAFDGETEIIHNGKGEKGSSKYQIKGVGRRKAPVDVPKTDWDWIIFP 
g^°Q'^^^KAD^ 

SPy1972 
Seq ID 228 

^!5]^'^y'^.^^P®'^'^^°'^'-'-^'^°'®'^^''^'^®'^^LGCTPSILTHQVAAKTIVGLARDEAQQGDGNA 

PASIAEHHLRMHFKTLPAGESLGSLGLWVWGDVDQPSKDWPNGAITMTKAKKDDYGYYLDVPLAAKHRQQVSYLINNKAGENLSKD 
QHISLLTPKMNEVWIDENYHAHAYRPLKKGYLRINYHNQSGHYDNLAVWTFKDVKTPTTDWPNGLDLSHKGHYGAYVDVPLKEGAN 
E^.^4'r5'^®'^'^°'^'°'''^^*^'''^°Y'-'^*^ELDNHTQVFVKDTDPKWNNPYYIDQVSLKGAEQTTPNEIKAIFTTLDGLDEDAVKQNIKITDKA 

Pi52?J?^'^^°*^°^^'^®°'^'^^'^^^LTSDSVKGISDYTGYYYLYEITRGQEKVMVLDPYAKSLAAWNDATATDDI^ 

kSKl^K?J:^^^'^''J?^'^^°^''^^^"^f^°'^^°'^'-^°KLTHPFGTFSAFVEQLDYLKDLGWHVQLLPVLSYFYANE^ 

^°'i^I!i!)?'?Y°PQ'^Y'^^SGMYSANPNDPALRIAELKNLVNEIHKRGMGVIFDWYNHTARTYLFEDLEPNYYHFMNADGTARESFG 
J?S?Jbw^"^!^^'''^'^^®'''^'-''"''^'''^'^°f''^'^°'^'^<3DHDAAAI^ 

7,^^^IIH?-'^^^'^®™'^'^'^^'^^°'^°^'-'"r<^°°'^'QQE°LIMGYQWASNGDRY^^ 

ssk§25SSal1!K^lliSrq^^^°'"^^^^ 

SPy1979 
Seq ID 229 

5^i5/?I^^'°Y,'^'-'-'^'^^'^'^*^"^^'^^^°^'A®Y®^LPDRPPINNSQLW 

TDNGAMPHKLEKADLLKAlQKQLIANVHSNDGYFEVIDFASDATITDRNGKVYFADKDGSVn-LPTQPVQEFLLKGIWRVRPYKEKPV^ 

HyKNREQAYEINPKTGIKEKTNNTDLVSEKYYVLKQGEKPYDPFDRSHU<LFTIKY\/UVI^NELLKSEQLLTASERNLDFRDL^^ 
AKLLYNNLDAFDIMDYTLTGKVEDNHDKNNRXAmA'MGKRPKGAKGSYHLAYDKDLYTEEERKAYSYL^ 

SPy1983 
Seq ID 230 
[^LTSKHHNLN 

BA2'°S^2Sf^S^°°'^'''^'^°®^''°^'^®^'^®'^'^°^TGAQGPVGPQGEKGETGAQGPAGPQGEAGKPGEQGPAGPQGEAGQPGE 

^^^^Sn^.S^^^QP^^'^P^'^S'^^^PAAEKPADKEANQTPERRNGNMAKTPVANNHRRLPATGEQANPFFTAAAVAVMT^^ 
AVTKRKENN 

SPy1991 
Seq ID 231 

LVa^^^^d °®^'^"^°°''^®'-'''^^'-'^°^'"^^'^'^'^'^2'^°°LPKGFS\n-ARDCDDQEIMAFEHHTL^ 

SPy2000
Seq ID 232 

VSKYLKYFSIITLFLTGLILVACQQQKPQTKERQRKQRPKDELWSMGAKLPHEFDPKDRYGVHNEGNITHSTLLKRSPELDIKGELAK 
TTHLSEDGLTWSFDLHDDFKFSNGEPVTADDVKFTYDMLKADGKAWDLTFIKNVEW 

wbL?¥S!!SI)!)]^/D^^^^ 

SPy2006 
Seq ID 233 

YI^K^y^/'^^^'^'^'-^'^'^'^^^Q^^'^^'^'^'SSATKDNQIAYIDDSKGKAKAPKTNKTMDQISAEEGISAE 

FYNGKVPYDAIISEELLMTDPNYRFKQSDVINEILDGYVIKVNGNYYWLKPGSKRKNIRTKQQIAEQVAKGTK^K^^^ 

o^Y^llY!^^^'^P°°'^'^°°°^'f'®'^™''°'^LGDAYLVPHGNHYHYIPKKDLSPSEL^ 

RKAPIPDWPNPGQGHQPDNGGYHPAPPRPNDASQNKHQRDEFKGKTFKELLDQLHRLDLKYRHN^EDGUFEPTO 
PHGDHYHIIPRSQLSPLEMELADRYLAGQTEDDDSGSDHSKPSDKEVTHTFLGHRIKAYGKGLDGKPYDTSDAYVFSKESIHSVDKS

GXn-AKHGDHFHYIGFGELEQYELDEVANWVKAKGCMDELAAALDQEQGKEKPLFDTKKVSRKWKDGKVGYMMPKDGKDYFY^ 

QLDLTQIAFAEQELMLKDKKHYRYDIVDTGIEPRLAVDVSSLPMHAGNATYDTGSSFVIPHIDHIHWPYSWLTRDQIATIKYVMQHPEV 

RPDIWSKPGHEESGSVIPNVTPLDKRAGMPNWQIlHSAEEVQKALAEGRFATPDGYlFDPRDVlJiiKETFVWKDGSFSIPRADGSSLRT 

1NKSDLSQAEWQQAQELU\KKNAGDATDTDKPKEKQQADKSNENQQPSEASKEEEKESDDFIDSLPDYGLDRATLEDHINQLAQKA 

NIDPKYLIFQPEGVQFYNKNGELVTYDIKTLQQINP 

SPy2009
Seq ID 234

mrraeI^nkhsrysirklsv/^^ 

aqketykqklktapdkdkllftyhseymtavkdlpastesttqpveapvqetqasasdsm\n-gdsts\m"dspeetpsses 

EAPAQPAESEEPSVAASSEETPSPSTPAAPETPEEP/^APSPSPESEEPSV/JiAPSEETPSPETPEEPAAPSQPAESEESSVAATTSPS 

pstpaesetqtppawkdsdkpssaaekpaasslvseqtvqqptskrssdkkeeqeqsyspnrslsrqvrahesgkylpstgekaq 
plfiatmtlmslfgsllvtkrqketkk 

SPy2010
Seq ID 235

LRKKQKLPFDKL^^ 

ktadtpatskatirdlndpsqvktlqekagkgagtwavidagfdknheawrltdotkaryqskedlekakkehgitygewvndk^ 
yyhdyskdgktavdqehgthvsgilsgnapsetkepyrlegampeaqlllmrveivngladyarnyaqaiidavnlgakvinmsfgna 
alayanlpdetkkafdyakskgvsivtsagndssfggktrlpladhpdygwgtpaaadstltvasyspdkqltetatvktadqqdk 
empvlstnrfepnkaydyayanrgmkeddfkdvkgk!auergdidfkdklanak!<agavgvliydnqdkgfpielpnvdqmpaafisr 

KDGLLLKENPQKTITFNATPKVLPTASGTKLSRFSSWGLTADGNIKPDIAAPGQDILSSVANNKYAKLSGTSMSAPLVAGIMGLLQKQY 

ETQYPDMTPSERLDLAKKVLMSSATALYDEDEKAYFSPRQQGAGAVDAKKASAATMY\n-DKDNTSSKVHLNNVSDKFEWVTVHNK 

SDKPQELYYQAWQTDKVDGKLFALAPKALYETSWQKITIPANSSKQWIPIDVSQFSKDLLAPMKNGYFLEGFVRFKQDPTKE^ 

PYIGFRGDFGNLSALEKPIYDSKDGSSYYHEANSDAKDQLDGDGLQFYALKNNFTALTTESNPWTIIKAVKEGVENIEDIESSEITETIFA 

GTFAKQDDDSHYYIHRHANGKPY/\AISPNGDGNRDYVQFQGTFLRNAKNLVAEVLDKEGN\AMn-SEVTEQWKNYNNDLASTLGST 

RFEKTRWDGKDKDGKWANGTYTYRVRYTPISSGAKEQHTDFDVIVDNTTPEVATSATFSTEDRRLTLASKPKTSQPVYRE^ 

MDEDLPTTEYISPNEDGTFTLPEEAETMEGATVPLKMSDFTYWEDMAGNITYTPVTKLLEGHSNKPEQDGSDQAPDKKPETKPEQD 

GSGQAPDKKPETKPEQDGSGQTPDKKPETKPEQDGSGQTPDKKPETKPEKDSSGQTPGKTPQKGQPSRTLEKRSSKRALATKAST 

KDQLPTTNDKDTNRLHLLKLVMTTFFLGLVAHIFKTKRTED 

SPy2016
Seq ID 236

MNlRNKflNSKTLLFTSLVAVALLGATQPVSAETYT 

DKEEWPEDWPEDDWSSDKKDETEDKTRPPYGEALGTGYEKRDDWGGPGTVATDPYTPPYGGALGTGYEKRDDWGGPGTVATDP 

YTPPYGGALGTGYEKRDDWRGPGHIPKPENEQSPNPLHIPEPPQIEWPQWNGFDGLSFGPSDWGQSEDTPPSEPRVPEKPQHTP 

QKNPQESDFDRGFSAGLKAKNSGRGIDFEGFQYGGWSDEYKKGYMQAFGTPYTPSAT 

SPy2018
Seq ID 237

MAKNNTN^ 

FKRAEELEKAKQALEDQRKDLETKLKELQQDYDLJ^KESTSWDRQRLEKELEEKKEALEU^IDQASRDYHRATALEKELEEKKI<^ 

IDQASQDYNRANVLEKELETlTREQElNRNLLGNAKLELDQLSSEKEQLTlEKAKLEEEKQiSDASRQSLRRDLDASREAKKW^ 

NLTAELDKVKEDKQISDASRQGLRRDLDASREAKKQVEKDLANLTAELDKVKEEKQISDASRQGLRRDLDASREAKKQVEKALEEAN 

SKLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKALKEQLAKQAEEU\KLR^^ 
NKAPMKETKRQLPSTGETANPFFTAAALTVMATAGV/VkWKRKEEN 

SPy2025 
Seq ID 238

MKKRKLLAWLLSTILLNSAVPLWADTSLRNSTSSTDQPTTADTDTDDESETPKKDKKSKETASQHDTQKDHKPSHTHPTPPSNDTK 

QTDQASSEATDKPNKDKNDTKQPDSSDQSTPSPKDQSSQKESQNKDGRPTPSPDQQKDQTPDKTPEKSADKTPEKGPEKATDKTP 

EPNRDAPKPIQPPUW^PVFIPWRESDKDLSKLKPSSRSS/V\YVRH\ArrGDSAYTHNLLSRRYGITAEQLDGFLNSLGIHYDKERLN 

RLLEWEKLTGLDVRAIVAIAMAESSLGTQGVAKEKGANMFGYGAFDFNPNNAKKYSDEVAIRHMVEDTIIANKNQTFERQDLKAKKW 

SLGQLDTLIDGGWFTDTSGSGQRRADlMTKLDQWIDDHGSTPEIPEHLKiTSGTQFSEVPVGYKRSQPQNVLTYKSETYSFGC^^^ 

YAYNRVKELGYQVDRYMGNGGDWQRKPGFVTTHKPKVGYWSFAPGQAGADATYGHVAWEQIKEDGSILISESNVMGLGTISYRT 

FTAEQASLLTYWGDKLPRP 

SPy2039
Seq ID 239

MN^KKKLG^ 

IVSGDKRSPEILGYSTSGSFDANGKENIASFMESYVEQIKENIKKLDTTYAGTAEIKQPWKSLLDSKGIHYNQGNPYNLIJP^^ 
EQSFVGQHAATGCVATATAQIMKYHNYPNKGLKDYTYTLSSNNPYFNHPKNLFAAISTRQYNWNNILPTYSGRESNVQKMAISELMA 
DVGISVDMDYGPSSGSAGSSRVQRALKENFGYNQSVHQINRGDFSKQDWEAQIDKELSQNQPWYQGVGKVGGHAFVIDGADGRN 
FYHVNWGWGGVSDGFFRLDALNPSALGTGGGAGGFNGYQSAWGIKP 

SPy2043
Seq ID 240

mnIusII^rvfskkgrlvkfsmx^^^^ 

LFPKAGDILYSKLDELGRTRTARGTLTYANVEGSYGVRQSFGKNQNPAGWTGNPNHVKYKIEWLNGLSYVGDFWNRSHLIADSLGG 
DALRVNAWGTRTQNVGGRDQKGGMRYTEQRAQEWLEANRDGYLYYEAAPIYNADELIPRANAA/SMQSSDNTINEKVLWN^^ 

TINYHNGTPTQK 

SPy2059
Seq ID 241

MF^FI^LLQKKFFPKAYQ^ 
SPQRPIRRWRRYHIGKLLMILIGTLVLLLGSYLFYLSKTAKVSDLQDALKATWIYDHKGEYAGSLSGQKGSYVELNAISDDLE^^^

EDRTFYSNSGINLKRFLLAVVrAGRFGGGSTITQQU\KNAYLSQDQTlKRKAREFFLALELTKKYSKKD!LTMYLNNS 

DASOKYFGTTAANLTLDEAATLAGMLKGPEIYNPYHSLKNATHRRDTVLGAMVDAKKITQTKAQQARAVGLKNRLADTYVGKTDDYK 

YPSYFDAVISEAIATYGLSEKDIVNNGYKVYTELDQNYQTGMQTTFNNDELFPVSAYDGSSAQAASVALDPKTGGVRGLIGRVNSSEN 

PTFRSFNYATQAKRSPASTIKPLWYAPAVASGWSIEKELPNTVQDFDGYQPHNYGNYESEDVPMYQALANSYNIPAVSTLNDIGIDK 

AFTYGKTFGLDMSSAKKELGVALGGSXnTNPLEMAQAYAAFANNGVIHPAHLINRlENARGEVLKTFTDKAKRWSQSVADKMTAMM 

LGTFSNGTAVNANWGYTU\GKTGTTETNFNPDLAGDQWVlGYTPDWlSQWVGFNQTDENHYLTDSSAGTASAIFSTa^SYlLP\^^ 

GSQFHVDNAYAQNGISAVYGVNETGNQSGVDTQSIIDGLRKSAQEASQSLSKAVDQSGLRDKAQSIWKEIVDYFR 

SPy2110 
Seq ID 242

MVSLEEDKNn-VQPDlKViKRDGRLVNFDSTKIYSALLKASMKVTRMSPLVEAKLEAISDRllAEllERFPTOIKlYE 

AKEYINYRTQRDFARSQATDINFSIDKLINKDQTWNENANKDSDVFNTQRDLTAGIVGKSIGLKMLPSHVANAHQKGDIHYHDLDYSP 

YTPMTNCCLIDFKGMUXNGFKIGNAEVESPKSIQTATAQISQIIANIVASSQYGGCTADRIDEFUKPYAELNFKKHMADAKK^^ 

SYAFEKTQKDIYDAMQSLEYEINTLFTSNGQTPFTSLGFGLGTSWFEREIQKAILTIRINGLGSEHRTAIFPKLIFTVKRGLNLEPDSPNY 

DlKTLALECATKRMYPDMLSYDKIIDLTGSFKSPMGCRSFLQGWKDENGQDWSGRMNLGXAn-LNLPRIAMESNGDMDKW^ 

RMLISKDALIYRVERWEAKPANAPILYQYGAFGKRLEKTGNVNDLFKNRRATVSLGYIGLYEVASVFYGGQWEGNPDA1<AFTLSIV(<A 

MKQACEDWSDEYGYHFSVYSTPSESLTDRFCRLDTEKFGIVTDITDKEYYTNSFHYDVRKSPTPFEKLDFEKDYPEAGASGGFIHYG 

EYPVLQQNPKALEANAA^DYAYDRVGYLGTNTPlDKCYNCQFEGDFTPTERGFTCPNCGNNDPKTVDWKRTCGYLGNPm^ 

RHKEISARVKHMNGSTIKYPGL 

SPy2127 
Seq ID 243

mf3=^nysrvidelrtdyglnlvaigqrlgtdprtvgkvvwqgkhnpnqesrkklnrlyrevketmmtqvn!feeandntkqvmq 

TNFHGQPLDIYGDIQEPLFLARAVAEMIDYTKTSQGYYDVQAMLRKVDEDEKLKGMALEGTTKNFRSGQKNAA/FLTEHGLYEVLMRSN 

kpkakefrkavknilkeirlngyymqgelvqelaqpstqklpgisdltyilnkladlvdmdnijvdis 

SPy2191 
Seq ID 244

mfkkenlkqryfnfglvalaltilaiifafssknadtksyakkseskmvtidkapknnhaitkeeskekaksiasepipw 

EVPWQQEWQTVQQVSSVAYNPNNWLSNGNTAGIVGSQ/W^QMAAATGVPQSTWEHilARESNGNPNAANASGASGLFQTMPG 
WGSTATVEDQVNAALKAYSAQGLSAWGY 

SPy2211
Seq ID 245

MKNNNKWIIAGLASFLFPLSIIFIILLSMGIYYNSDKTILASDAFHQYVIFAQNFRNIMHGSDSFFYTFTSGLGINFYALMCYYLGSFFSPLL 

FFFNLTSMPDAIYLFTLiKFGLIGL/KACYSFHRLYPKISAFLM!SISVFYSLMSFLTSQMELNSWLDVFlLLPLViLGLNKLITENKTRTYYLS 

ISLLFIQNYYFGYMIALFCILYALVCLLRLNDFNKMFIAFVRFTAVSIC/^ALTSALVILPTYLDLSTYGENLSPIKQLVTNNAWFLDIPAKLS! 

GVYDTTKFNALPMIYVGLFPLMLSVIYFTLESIPLKIKLANACLLTFIIISFYLQPLDLFWQGMHSPNMFLHRYAWSFSIVILLLACETLSRL 

KEVTQlKAGFAFIFLIlLTSLPYSFSQQYNFLPLTLFLLSVFLLLGYTiSLFSFRNSQIPSTFiSAFILlFSLLESGLNTYYQLQGINKEWGFPS 

RQIYNSQLKDINNLVNSVSKNSQPFFRMERLLPQTGNDSMKFNYYGISQFSSVRNRLSSSLLDRLGFQSKGTNLNLRYQNNTIIMDSL 

LGiKYNLSEGPPNKFGFTKLKTSGNTTLYQNHYSSPLAILTRNVYKDVNLNVNTLDNQTKLLNQLSGKSLTYFNLQPAQLISGANQFNG 

QISAQASDYQNSVTLNYQINIPKHSQLYVSIPNIIFSNPDAKEMRIQTDNHNFIYTTDNAYSFFDLGYFADAKVATFSFVFPKNKQISFKE 

PHFYSLSIESYLEAMNSIKQKNVHTYAKSNTVITDYNSKTKGSLIFTLPYDKGWSAQKDGKNLPVKKAQGGFLSVriPKGKGRVILTFIP 

NGFKLGLSLSCVGIIAYMLLYKYIDIKSKLL 

ARF0450 
Seq ID 246 

fsrflptrrdysslwsascrnehynsqhhhgvgtvsskqnprpi 

ARF0569 
Seq ID 247 
sflwekrnpegs 

ARF0694 
Seq ID 248 

kgeektevtkekllelarwikdisddtdektedeayydgdgteettv 

ARF0700 
Seq ID 249 

lyqkkiikksqrisklmtsrilnkalmtskie 

ARF1007 
Seq ID 250 
fvlqkyslwq 

ARF1145 
Seq ID 251 

pismqkaiqviaimveqn\wllskrlllnvlrnfsvphlprnfkpipevkpmqqn\wp 

ARF1208 
Seq ID 252 

frnifcdfsccsslvscyqklkrkgynrtskkrfl 

ARF1262 
Seq ID 253 
vwtgckwwffc

ARF1294 
Seq ID 254 

Imlkakktnsklvtlsqptkkfniqklfnqtnllkplslwvllqttis 

ARF1316 
Seq ID 255 

pmgwgrferyerlgrtrhdhvncysrngicspss 

ARF1352 
Seq ID 256 

Imncpslhflqpkhkeqpvlkmlknyeskkqivfk 

ARF1481 
Seq ID 257 

ktlkcnylkrpklvesrlqrtrfrricsrkhgryrrwirrflifltnkskkilvkrrvkrll^ 

ARF1557 

Seq ID 258 

grrlpprlpqekskwilpy 

ARF1629 
Seq ID 259 

Iwspgsryfvrdandcqrtgfskcdfswgtkcryflkfagfssvrknvsivntgcwsgrpcp 

ARF1654 
Seq ID 260 

cvpsvkcsimlqintplsilfpntlvqagvifrvypigfplfllewqksqq 

ARF2027 
Seq ID 261 

sdyfrhhapflkwlrsaknnskdircpyyyangir 

ARF2093 
Seq ID 262 

llllkqtsklnlllkanqkrsgtkssqvkwtasclttlkltkltlflhkftswttakllkltltq 

ARF2207 
Seq ID 263 

hlfvkdvwstlkiwercsvcykkvvkkqeiwqprlyqk 

CRF0038 

Seq ID 264 

lyapqslsnpfldslpcdq 

CRF0122 
Seq ID 265 

nrrdfldnwnllnrteplgtslvfdnfinraahlninnlglgiglnelsctfqpfcvttk 

CRF0406 

Seq ID 266 

qaplddhhnkptywsgyl 

CRF0416 
Seq ID 267 

yflkktplkaakswllspfgemaktgfpwaffsklnlpsaflkvpsvcrplselvlvdtlvseprvtspvkclpt 

CRF0507 
Seq ID 268 

sknkrntdtgcnngkgsksvshnhsknnhakhlsenakeaaisrynlpnersnhitntsscknss 

CRF0549 
Seq ID 269 

Iflhrsrllldflvinfslfvqiyddflng 

CRF0569 
Seq ID 270 
sfiwekrnpegs 

CRF0628
Seq ID 271 

ikhltqakqrmpvskvlvanplgskgiadsiqlrmkpiavkryrsslstr 

CRF0727 
Seq ID 272 
ppppnl ppacktvktvstagrpvlawistglprpssttvm elsasiktli

CRF0742 
Seq ID 273 

enefnqyyqdakykshkerltintfkrqgylr 

CRF0784 
Seq ID 274 

trvapyfpqalalfsepvtliasfkafpsssaastav 

CRF0854 
Seq ID 275 

spftksvltpsaiklssikrpvkppknpvavlamskrckmrltlmplppanlsslamrftdprfiwslltiksiegfkvtv 

CRF0876 
Seq ID 276 

dhniywhyqiklsqvqtmtfppi 

CRF0907 

Seq ID 277 

pnlldhflpnnphqnhkakid 

CRF0979 
Seq ID 278 

qcliiinninfrhnkntpscilkraslhdlifhaetllw 

CRF1068 
Seq ID 279 

eklfrtarqrynfkvwskkqimglvli\rflksrnrnsesklffyhlfghidlsnasinnnqvrslqslimptvpskkhfsht^ 
k 

CRF1152 
Seq ID 280 

dncrgclstnldnkthsftarfvtnccntf 

CRF1203 
Seq ID 281 

Igsinqlvsfalvtmdevktlfktlltptfeec 

CRF1225 
Seq ID 282 

yqvcqlpegvpqlivnadqtwfykaktkqrtkkqwrsnq 

CRF1236 
Seq ID 283 
ilakieisqktlltlilmrs 

CRF1362 
Seq ID 284 

fpdvndkvratxisrffgkertcllsnrfwevgdkkvetnpnndlgd 

CRF1524 
Seq ID 285 

kfnvtllplcqnelvitlflfifchllllnrganlssqkvikevr 

CRF1525 
Seq ID 286 

etsallaarflrsadspklliclspkrsgtkssnslprsrppaikikrllskpdkaf 

Invention: 1; claims 1, 2, 5-11, 14-37 (all in part, as applicable)

A nucleic acid encoding a hyperimmune serum reactive
antigen, a hyperimmune serum reactive antigen, a fragment of
said hyperimmune serum reactive antigen; an antibody against
said hyperimmune serum reactive antigen or fragment; a
process for producing said nucleic acid, hyperimmune serum
reactive antigen, fragment or antibody; a pharmaceutical
composition comprising said nucleic acid, hyperimmune serum
reactive antigen, fragment or antibody; methods of
identifying an agonist or antagonist; methods of diagnosis,
uses of the nucleic acid, the hyperimmune serum reactive
antigen or fragment in the manufacture of an aptamer,
spiegelmer, ribozyme, antisense oligonucleotide or siRNA,
all of them relating to the nucleic acid of SEQ ID NO: 1,
the hyperimmune serum reactive antigen of SEQ ID NO: 151 and
the fragment comprising amino acids 4-44 of SEQ ID NO: 151.



Inventions: 2-150; claims: 1-37 (all in part and as applicable) 

A nucleic acid encoding a hyperimmune serum reactive
antigen, a hyperimmune serum reactive antigen, a fragment of
said hyperimmune serum reactive antigen; an antibody against 
said hyperimmune serum reactive antigen or fragment; a 
process for producing said nucleic acid, hyperimmune serum 
reactive antigen, fragment or antibody; a pharmaceutical 
composition comprising said nucleic acid, hyperimmune serum 
reactive antigen, fragment or antibody; methods of 
identifying an agonist or antagonist; methods of diagnosis, 
uses of the nucleic acid, the hyperimmune serum reactive 
antigen or fragment in the manufacture of a medicament, an 
aptamer, spiegelmer, ribozyme, antisense oligonucleotide or 
siRNA, all of them relating to the nucleic acid of SEQ ID 
NOs: 2-150 and the polypeptides encoded by said nucleic 
acid, SEQ ID NO: 152-300, respectively 



Inventions: 151-172; claims 1, 2, 5-11, 14-37 (all in part, as
applicable).

As for invention 1, all relating to a fragment of SEQ ID NO: 
57-65 (Invention 151), 67-98 (Inv. 152), 101-107, 109-125,
131-144, 146-159, 168-173, 181-186, 191-209, 296-213,
229-245, 261-269, 288-301, 304-317, 323-328, 350-361,
374-384, 388-407, 416-425, 1-114, 18-33 (Inv. 171) and
62-72 (Inv. 172) of SEQ ID NO: 151, respectively.